Java Forum / General / January 2008
Faster than Array search?
Sanny - 14 Jan 2008 09:18 GMT I have a piece of code where i need to search from an Array list
For Example
char[] array1= new char[100];
for (int i=0;i<100000;i++){
int hh=function1(i); if (array1[hh]=='K') ff+=function2(hh); }
Here I found most of the time is spent in evaluating the if condition
if (array1[hh]=='K').
How can this char array be searched faster than finding hh position equal to 'K' or not.
Can I use String / BufferString to increase the speed of above evaluation?
Will using Vector / List improve or can I use Set. What will be the fastest way to replace this condition statement. if (array1[hh]=='K').
Bye Sanny
Roedy Green - 14 Jan 2008 09:22 GMT On Mon, 14 Jan 2008 01:18:00 -0800 (PST), Sanny <softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who said :
>How can this char array be searched faster than finding hh position >equal to 'K' or not. see http://mindprod.com/jgloss/binarysearch.html
http://mindprod.com/jgloss/hashmap.html
You might even be able use 'K' (perhaps modified with an offset) as an index into an array for an extremely fast lookup.
e.g.. header[ letter - 'A' ];
 Signature Roedy Green, Canadian Mind Products The Java Glossary, http://mindprod.com
Sanny - 14 Jan 2008 11:07 GMT On Jan 14, 2:22 pm, Roedy Green <see_webs...@mindprod.com.invalid> wrote:
> On Mon, 14 Jan 2008 01:18:00 -0800 (PST), Sanny > <softta...@hotmail.com> wrote, quoted or indirectly quoted someone who [quoted text clipped - 6 lines] > > http://mindprod.com/jgloss/hashmap.html I am using Array char[] array1= new char[100]; This array is not sorted So how can I use Binary search?
> You might even be able use 'K' (perhaps modified with an offset) as an > index into an array for an extremely fast lookup. > > e.g.. header[ letter - 'A' ]; if (array1[]=='K') here 'K' is not unique It may happen more that 2 points have value 'K'.
Array1={'a','m', 'x', 'K', 'L', 'L', 'K', 'K'...........'X','b', 'a'} So same character may be present at many places in the array.
Bye Sanny
Patricia Shanahan - 14 Jan 2008 13:44 GMT ...
> if (array1[]=='K') here 'K' is not unique It may happen more that 2 > points have value 'K'. ...
Use an array with the indexing based on the char value, and with an List<Integer> as value. The code for scanning all elements with value "K" becomes:
for(Integer index : lookupArray["K"+128]){ ... }
Patricia
Roedy Green - 14 Jan 2008 14:45 GMT On Mon, 14 Jan 2008 03:07:27 -0800 (PST), Sanny <softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who said :
>I am using Array char[] array1= new char[100]; >This array is not sorted So how can I use Binary search? you have to sort it first. See http://mindprod.com/jgloss/sort.html
 Signature Roedy Green, Canadian Mind Products The Java Glossary, http://mindprod.com
Roedy Green - 14 Jan 2008 14:52 GMT On Mon, 14 Jan 2008 03:07:27 -0800 (PST), Sanny <softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who said :
>if (array1[]=='K') here 'K' is not unique It may happen more that 2 >points have value 'K'. > >Array1={'a','m', 'x', 'K', 'L', 'L', 'K', 'K'...........'X','b', 'a'} >So same character may be present at many places in the array. then simple indexing won't work. What you could do is have an array indexed by a-z A-Z. At that index is a tiny array giving you the list of elements that match that letter.
You then have to compute the index like this:
final char letter = ???; final int index; if ( 'a' <= letter && letter <= 'z' ) { index = letter - 'a'; } else if ( 'A' <= letter && letter <= 'Z' ) { index = letter- 'Z' + 26; } else { throw new IllegalArgumentException ( "oops bad letter " + letter ); }
 Signature Roedy Green, Canadian Mind Products The Java Glossary, http://mindprod.com
Martin Gregorie - 14 Jan 2008 22:01 GMT > On Mon, 14 Jan 2008 03:07:27 -0800 (PST), Sanny > <softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who [quoted text clipped - 11 lines] > > You then have to compute the index like this: If I understand Sanny's code correctly, it can be paraphrased as:
1) do some magic in function1() that returns an integer value called 'hh' in the range 0..100 2) if the value has one of a set of arbitrary values then pass 'hh' to function2() which does more magic and returns some value which is added to a variable called 'ff'
The array is never searched.
The test for the arbitrary value is to use 'hh' as a direct index into an array that contains an element corresponding to each of the 100 possible values of 'hh' and to compare the indexed element with a literal constant.
AFAIK direct indexing is the fastest way to access an array, so something else must be eating up the CPU cycles.
Sanny, some questions:
- presumably you're using some sort of profiler, so does it measure time spent in a code clause, or just in the entire line?
- how do you know that the wait time is in the condition? It matters whether the profiling tool is actually measuring the time spent in each statement with a nanosecond timer or if it merely looks at your program every few microseconds and increments an array indexed on the line number of the statement that the instruction pointer happens to be in. The second type used to be most common and probably still are.
If it does the second, then it can be fooled in some circumstances and, while it may be OK for saying how much time is spent in your loop, its less reliable for saying exactly which statement(s) in the loop are eating the CPU cycles.
- do you get the same answer if you rewrite the line like:
if (array1[hh]=='K') { ff+=function2(hh); }
which would let a line or source statement oriented profiler distinguish time spent in function2 from time in the condition test?
Using a String, StringBuffer, Vector or List could be as fast as directly indexing an array but it is very unlikely to be faster because:
- String.charAt() and StringBuffer.charAt() involve method calls, so they will be slower even if their internal implementation is an indexed array access.
- Vectors and Lists are likely to be implemented as more complex structures than mere arrays. Accessing them consists of a method call plus a more complex, and hence slower, access procedure than an indexed array access.
- Sets are unsuitable because they can't hold duplicate values. They also add overheads: HashSet has to be slower than indexing because there's an overhead in calculating the hash and the TreeSet has a tree walking overhead.
 Signature martin@ | Martin Gregorie gregorie. | Essex, UK org |
tzvika.barenholz@gmail.com - 14 Jan 2008 12:28 GMT > I have a piece of code where i need to search from an Array list > [quoted text clipped - 25 lines] > Bye > Sanny The short answer is no, I think. because iteration over the list will not be quicker than the array reference operator [], and the == comparison between chars will be necessary anyway, within String or any other character sequence.
T
ownowl - 14 Jan 2008 12:35 GMT Sanny a écrit :
> I have a piece of code where i need to search from an Array list > [quoted text clipped - 24 lines] > Bye > Sanny Hi It depends of the king of data, but mayby you could use a tree instead array
Olivier
Sanny - 14 Jan 2008 16:00 GMT > Sanny a écrit : > [quoted text clipped - 33 lines] > > - Show quoted text - Does java support tree?
For Tree one needs pointers. I know in C++ tree can be made how that in Java?
Bye Sanny
Patricia Shanahan - 14 Jan 2008 16:40 GMT ...
> Does java support tree? > > For Tree one needs pointers. I know in C++ tree can be made how that > in Java? ...
Given the extreme extent to which Java is based on pointers, is should be unsurprising that it has tree implementations. See, for example, java.util.TreeMap.
(Any non-null Java reference is a pointer to some object, and those pointers are the only way to access Java objects.)
However, array access tends to be faster, and for this sort of lookup by letter, the array will be small enough to be practical.
Patricia
Roedy Green - 14 Jan 2008 17:07 GMT On Mon, 14 Jan 2008 08:00:33 -0800 (PST), Sanny <softtanks@hotmail.com> wrote, quoted or indirectly quoted someone who said :
>For Tree one needs pointers. I know in C++ tree can be made how that >in Java? Java has safe pointers called "references". see http://mindprod.com/jgloss/pointer.html
 Signature Roedy Green, Canadian Mind Products The Java Glossary, http://mindprod.com
bugbear - 14 Jan 2008 12:50 GMT Ah, you again.
If you tell us what you're trying to achieve, we might be able to help you. It appears that you're trying to micro-optimise (tactics) a program, where a superior algorithm (strategy) will probably give better results.
BugBear
Hal Rosser - 15 Jan 2008 04:21 GMT >I have a piece of code where i need to search from an Array list > [quoted text clipped - 21 lines] > fastest way to replace this condition statement. > if (array1[hh]=='K'). Take a look at the Arrays class. Arrays.sort(arrayName) then Arrays.binarySearch(arrayName, value)
Lasse Reichstein Nielsen - 15 Jan 2008 19:42 GMT > I have a piece of code where i need to search from an Array list ...
> char[] array1= new char[100]; > [quoted text clipped - 3 lines] > if (array1[hh]=='K') ff+=function2(hh); > } Curious example. You are doing 100000 iterations where each iteration picks an index between 0 and 99. That means that you repeat the operation depending on hh an average of 1000 times for each value of hh. If you know something about function1, it might be possible to do some loop invariant computation. However, in this case, it would probably mean that you update an array of counters 100000 times instead of reading an array of chars as many times. It would be more expensive, if it's the array lookup that costs, and not the computation of function2.
> Here I found most of the time is spent in evaluating the if condition > > if (array1[hh]=='K'). Then you are lucky. Your functions are pretty quickly evaluated if they are faster than a single array lookup. My bet is it won't get any faster.
> How can this char array be searched faster than finding hh position > equal to 'K' or not. What do you mean by "searching". If the information you need is whether the hh'th position equals 'K', then there is no faster way than checking it.
Any optimization would mean creating a lookup table, and that would also be an array. Your only chance is to use knowledge of function1 (or perhaps function2) to avoid doing some of the loop iterations.
> Can I use String / BufferString to increase the speed of above > evaluation? No, they are based on char arrays too.
> Will using Vector / List improve or can I use Set. Unlikely, they also use arrays internally (except LinkedList, which is definitly not faster).
> What will be the > fastest way to replace this condition statement. > if (array1[hh]=='K'). The time taken here is almost exclusively in the bounds checking of the array access. Comparing to a constant is trivial. The only chance to improve on this is to do less than one array lookup per iteration. Again, that only works if you know and can exploit some specific properties of the functions or the array contents.
If you know the array at compile time, you might be able to create a 100-way switch statement that is equivalent, or some other ingenious design, but that would not generalize very well.
There is no faster, general, way to check whether an array index holds a specific value dynamically.
All containers that allow constant time lookup are using arrays internally (it's the only construction in pure Java that can have an arbitrary size which is determined at runtime). Using them will require at least one array lookup anyway.
/L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|