Java Forum / General / November 2005
HashMap with primitive int key
Andrew E - 28 Nov 2005 20:48 GMT Dear All,
I have a class like this:
class CustomerOrder { int globalID; ... }
I have an array of these. I'd like to store an index of globalIDs, so I can do:
CustomerOrder o = index.get(123)
I could modify my class to be:
class CustomerOrder { Integer globalID; ... }
Perhaps it is my C++ background, but the idea of allocating an object for every ID when I can just use an int bothers me.
I figure a HashMap is the right tool of choice, e.g.:
HashMap<Integer, CustomerOrder> index = new ...
Has anyone out there already written a modified hashmap class that uses int as the key?
Thanks for any tips :)
Andrew
Oliver Wong - 28 Nov 2005 20:58 GMT > Dear All, > [quoted text clipped - 30 lines] > > Thanks for any tips :) If you use Integer.valueOf(int), it'll re-use instances of Integer when possible rather than creating a new instance every time with the constructor. See the Flyweight design pattern.
- Oliver
Chris Smith - 28 Nov 2005 21:02 GMT > I could modify my class to be: > [quoted text clipped - 5 lines] > Perhaps it is my C++ background, but the idea of allocating an object for every > ID when I can just use an int bothers me. It's possible to write a custom hashing class that uses int as the key type... but it's probably not a good idea. Instead, just eat the cost of the new objects. Short-lived object allocation has very different performance characteristics in C++ and Java, so your intuition may be off here.
In any case, it's easy enough to fix if you find out it's a performance problem.
 Signature www.designacourse.com The Easiest Way To Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer MindIQ Corporation
Andrew E - 29 Nov 2005 12:07 GMT >>I could modify my class to be: >> [quoted text clipped - 11 lines] > performance characteristics in C++ and Java, so your intuition may be > off here. Thanks for your reply Chris, and thanks to Oliver and zero too.
Its interesting that you wrote there are different performance characteristics for short-lived objects between C++ and Java.
Can you expand on that, or suggest some references to read? Perhaps my C++ background is hampering me here, as I look at 'new Integer(1)' etc and cringe. I'd like to learn more in this area. Any tips?
Thanks :) Andrew
Chris Smith - 29 Nov 2005 17:23 GMT > Its interesting that you wrote there are different performance characteristics > for short-lived objects between C++ and Java. > > Can you expand on that, or suggest some references to read? Perhaps my C++ > background is hampering me here, as I look at 'new Integer(1)' etc and cringe. > I'd like to learn more in this area. Sure.
Obviously, Java and C++ differ in that Java is designed as a garbage collected language, while C++ is designed with a manually managed heap. It's a common, but very wrong, assumption that garbage collection is something that's ADDED in Java, and that otherwise memory allocation looks about the same. Neither language specifies the management of the heap, but in fact, the heap looks very different in typical Java than in typical C++.
C++ keeps a linked list of available memory for a process, and allocating an object means starting somewhere in that list and walking through the list to look for some available bit of memory. This is, worst case, actually O(n) on the total number blocks of memory currently in use by the system -- though clearly it's not quite that bad in common practice. In return for the terrible asymptotic complexity of memory allocation, deallocation in C++ is cheap, running in constant time.
(I'm not considering calling destructors, which obviously could run in non-constant time, to be part of deallocation, nor do I consider constructors as part of allocation.)
Java doesn't maintain a linked list, at least for newly allocated objects. Instead, it just has a block of memory and keeps lopping off more objects until it runs out. That block of memory is called the nursery. So allocation is constant time, and with an extremely low constant at that! Andrew Appel famously demonstrated that on processors with auto-increment addressing modes, allocation can actually be performed by dedicated CPU hardware with no performance cost whatsoever, so it may actually be said to be O(0), or free. By contrast, deallocation is more expensive; it is O(m+n), where m is the size of the root set -- often considered to be constant -- and n is the number of objects that will NOT be deallocated. Hence, the only cost of allocating a small, short-lived object is that the next minor deallocation (garbage collection) will need to happen a little sooner.
Now, clearly that allocation is not free... more frequent garbage collections do add up. However, the impact is certainly far less than frequently allocating and freeing small objects in C++.
 Signature www.designacourse.com The Easiest Way To Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer MindIQ Corporation
Thomas Hawtin - 29 Nov 2005 19:38 GMT > C++ keeps a linked list of available memory for a process, and > allocating an object means starting somewhere in that list and walking [quoted text clipped - 3 lines] > practice. In return for the terrible asymptotic complexity of memory > allocation, deallocation in C++ is cheap, running in constant time. I think C/C++ tends to be better implemented than that.
I remember Acorn (ARM) C malloc having a description of the algorithm used. For small sizes it maintained an array of linked lists. A separate list for each allocation size (rounded to a certain power of two). Small allocations consisted of: checking value is size is small enough, rounding to the fixed power of two, loading the table address, checking that there is a node in that list and switching to the next link. That should be about as fast as the modern Java implementation, and have much better cache locality. Obviously preemptive multi-threading would have made it more difficult.
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/
Thomas Hawtin - 29 Nov 2005 18:13 GMT > Its interesting that you wrote there are different performance characteristics > for short-lived objects between C++ and Java. > > Can you expand on that, or suggest some references to read? "Java theory and practice: Urban performance legends, revisited" is a recent, widely linked article that comes top of my google search for Java allocation.
http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html?ca=dgr-lnxw01 JavaUrbanLegends
In Sun's JVM at least: Each thread has Thread Local Allocation Buffer (TLAB). A Processor Local Allocation Buffer (PLAB) is also an effective alternative, using some cunning tricks. Small objects are allocated within the thread's TLAB. The fast-path for allocating these objects is effectively just bumping a pointer (with some complications for interactions with the garbage collector). When the TLAB is exhausted, a new area is allocated. Periodically the GC will copy out any live objects within the old area. The area now containing copied and dead objects can just be used a fresh area of memory. It's not necessary to so much as look at dead objects.
The details are quite complicated. For instance, working out when objects from other areas reference newly created objects. Still, where you have lots of short lived objects, the performance easily beats malloc-style algorithms.
> Perhaps my C++ > background is hampering me here, as I look at 'new Integer(1)' etc and cringe. > I'd like to learn more in this area. That should read 'Integer.valueOf(1)'.
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/
Roedy Green - 30 Nov 2005 00:36 GMT On Tue, 29 Nov 2005 18:13:01 +0000, Thomas Hawtin <usenet@tackline.plus.com> wrote, quoted or indirectly quoted someone who said :
>A Processor Local Allocation Buffer (PLAB) is also an effective >alternative, using some cunning tricks. then there is one more trick to allocate objects on the stack so they automatically get popped off when the method ends without using garbage collection.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
zero - 29 Nov 2005 10:22 GMT > Dear All, > [quoted text clipped - 30 lines] > > Andrew In JDK 1.5 your original code would work fine because Java creates an Integer object automatically. Of course this doesn't really change anything, it just hides the object from you.
Obviously, a primitive int does not have a hash code - unless you define the int itself as its own hash code. It wouldn't be too hard to re- implement HashMap to use a raw int as key, but this would only be a good idea if you can be sure you won't have too many collisions. And even then, I think it wouldn't be worth the effort.
 Signature Beware the False Authority Syndrome
Ian Pilcher - 29 Nov 2005 14:53 GMT > Obviously, a primitive int does not have a hash code - unless you define > the int itself as its own hash code. It wouldn't be too hard to re- > implement HashMap to use a raw int as key, but this would only be a good > idea if you can be sure you won't have too many collisions. And even > then, I think it wouldn't be worth the effort. Have you looked at what Integer.hashCode() returns?
 Signature ======================================================================== Ian Pilcher i.pilcher@comcast.net ========================================================================
zero - 29 Nov 2005 17:40 GMT >> Obviously, a primitive int does not have a hash code - unless you >> define the int itself as its own hash code. It wouldn't be too hard [quoted text clipped - 3 lines] > > Have you looked at what Integer.hashCode() returns? lol no I haven't, and I'm quite sure it's just the integer itself. But that doesn't change the fact that a raw int doesn't have a hash code.
 Signature Beware the False Authority Syndrome
Roedy Green - 30 Nov 2005 00:38 GMT >lol no I haven't, and I'm quite sure it's just the integer itself. But >that doesn't change the fact that a raw int doesn't have a hash code. Integer in contrast does have an hashCode, the integer value itself
public int hashCode() { return value;
Long xors the low and high halves:
public int hashCode() { return (int)(value ^ (value >>> 32)); }
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|