Java Forum / General / September 2006
Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write?
Danno - 15 Sep 2006 01:38 GMT Just curious
Lionel - 15 Sep 2006 01:40 GMT > Just curious Not even close :(.
Matt Humphrey - 15 Sep 2006 02:02 GMT > Just curious Yes, for exactly every class that needs a custom implementation. A class-specific meaning is not necessarily needed for every object.
Matt Humphrey matth@ivizNOSPAM.com http://www.iviz.com/
Arne Vajhøj - 15 Sep 2006 02:07 GMT > Just curious toString: often
equals and hashCode: sometimes
Arne
Manish Pandit - 15 Sep 2006 02:20 GMT I use toString for the transfer objects, and put the Apache Commons ToStringBuilder.reflectionToString(this) in the overridden method. However, this is a very expensive operation, so I control it via log4j levels, like logger.debug(myObject), instead of going loose cannon with System.out.println :-)
I use equals rarely, when I have to compare DTOs based on a set of attributes - for example, comparing 2 persons based on their firstNames being same, ignoring other attributes.
hashCode - never.
-cheers, Manish
> > Just curious > [quoted text clipped - 3 lines] > > Arne
Arne Vajhøj - 15 Sep 2006 02:57 GMT > I use equals rarely, when I have to compare DTOs based on a set of > attributes - for example, comparing 2 persons based on their firstNames > being same, ignoring other attributes. > > hashCode - never. You know what you should do ...
Arne
Manish Pandit - 15 Sep 2006 03:33 GMT > You know what you should do ... eh?
Arne Vajhøj - 15 Sep 2006 03:44 GMT >> You know what you should do ... > > eh? If you implement equals, then you should implement hashCode !
Arne
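Arne's rule can be illustrated with a small sketch. The classes NameOnly and NameWithHash below are hypothetical, not from the thread. The first overrides equals() only, so two equal instances inherit identity-based hash codes and a HashSet lookup with a fresh equal object is not guaranteed to succeed. The second adds a matching hashCode().

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: overrides equals() but inherits Object.hashCode().
final class NameOnly {
    private final String name;
    NameOnly(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof NameOnly && ((NameOnly) o).name.equals(name);
    }
    // hashCode() is deliberately not overridden, which breaks the contract.
}

// Hypothetical fixed variant: equal objects now report equal hash codes.
final class NameWithHash {
    private final String name;
    NameWithHash(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof NameWithHash && ((NameWithHash) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class EqualsWithoutHashCodeDemo {
    // Looks up a fresh but equal object in a HashSet of the fixed class.
    static boolean fixedLookupWorks() {
        Set<NameWithHash> set = new HashSet<>();
        set.add(new NameWithHash("Ann"));
        return set.contains(new NameWithHash("Ann"));
    }

    public static void main(String[] args) {
        Set<NameOnly> broken = new HashSet<>();
        broken.add(new NameOnly("Ann"));
        // Not guaranteed to print true: the lookup depends on identity hash codes.
        System.out.println(broken.contains(new NameOnly("Ann")));
        System.out.println(fixedLookupWorks());
    }
}
```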
Manish Pandit - 15 Sep 2006 04:04 GMT Ah :)
In conjunction with equals() - yes, standalone - never.
-cheers, Manish
> >> You know what you should do ... > > [quoted text clipped - 3 lines] > > Arne
Mark Jeffcoat - 15 Sep 2006 04:07 GMT > Just curious You'll know when you need equals()... most things don't get compared, so there's usually no reason to bother.
Once you've put in your own equals() method, you really need to override hashCode(), too. But premature optimization is evil, so here's the one I always use pre-profiling:
public int hashCode() { return 0; }
I'm not kidding.
 Signature Mark Jeffcoat Austin, TX
Daniel Dyer - 15 Sep 2006 10:35 GMT >> Just curious > [quoted text clipped - 9 lines] > return 0; > } Personally, I'd never implement hashCode this way because the effort to do it properly is so minimal. Premature optimisation may be evil but, if there are two ways to do something, choosing the better way is not necessarily premature optimisation - it could just be doing things right first time around.
If you are going to be putting these objects into a hashed collection (maps and sets), you're effectively turning that collection into a list with this hashCode implementation, and I'm assuming that you're not using lists everywhere until the profiler tells you that you need a better data structure.
You can generally write a reasonable hashCode implementation in a few lines following the approach that Josh Bloch outlines in Effective Java. IDEs such as IDEA can automatically generate matching equals and hashCode methods for you. Since this is a general purpose approach and trivial to do I don't see it as premature optimisation. If you were talking about writing a hashCode implementation that is optimised to the specific usage patterns of your application, then I would agree that that would probably qualify as optimisation that is best left until there is a demonstrated need.
Dan.
 Signature Daniel Dyer http://www.dandyer.co.uk
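A minimal sketch of the kind of matched pair Daniel describes, in the style commonly associated with Effective Java: start from a non-zero constant and fold in each significant field with a multiplier of 31. The class PersonSketch and its fields are hypothetical and only illustrate the shape; an IDE-generated version may differ in detail.

```java
// Hypothetical value class; the fields are illustrative only.
final class PersonSketch {
    private final String firstName; // may be null
    private final int age;

    PersonSketch(String firstName, int age) {
        this.firstName = firstName;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PersonSketch)) return false;
        PersonSketch other = (PersonSketch) o;
        boolean sameName = (firstName == null)
                ? other.firstName == null
                : firstName.equals(other.firstName);
        return sameName && age == other.age;
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals() compares.
        int result = 17;
        result = 31 * result + (firstName == null ? 0 : firstName.hashCode());
        result = 31 * result + age;
        return result;
    }
}
```

The key design point is that hashCode() reads the same fields as equals(), so two objects that compare equal cannot end up with different hash codes.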
Chris Uppal - 15 Sep 2006 13:42 GMT > Once you've put in your own equals() method, you really > need to override hashCode(), too. But pre-mature optimization [quoted text clipped - 5 lines] > > I'm not kidding. You should be.
/Anything/ would be better than that. Throw a runtime exception -- at least that would guarantee that you weren't depending on having a working hashCode() without realising it.
-- chris
Patricia Shanahan - 15 Sep 2006 14:18 GMT >> Once you've put in your own equals() method, you really >> need to override hashCode(), too. But pre-mature optimization [quoted text clipped - 11 lines] > that would guarantee that you weren't depending on having a working hashCode() > without realising it. It is a functionally correct hashCode. If a and b reference equal instances of the class, a.hashCode() is equal to b.hashCode(). If there is any problem with it, it will show as poor hash structure performance.
I assume that if Mark found poor hash table performance during profiling, he would upgrade the hashCode() method for the key class.
That said, when I'm writing an equals(), I know exactly which features must be equal for two instances to be considered equal, which is the only difficult part of writing a good hashCode(). I just write the two methods together.
Patricia
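Patricia's point that a constant hashCode() is functionally correct can be checked with a short sketch. ConstantHashKey and ConstantHashDemo are hypothetical names. Every key collides, so the map has to fall back on equals() to tell keys apart, but each lookup still returns the right value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class using the constant hashCode() from the thread.
final class ConstantHashKey {
    private final int id;
    ConstantHashKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashKey && ((ConstantHashKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return 0; // legal, but every key collides
    }
}

final class ConstantHashDemo {
    // Stores n entries, then checks that each can be found through a fresh, equal key.
    static boolean allLookupsSucceed(int n) {
        Map<ConstantHashKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantHashKey(i), i * 10);
        }
        for (int i = 0; i < n; i++) {
            Integer v = map.get(new ConstantHashKey(i));
            if (v == null || v != i * 10) return false;
        }
        return map.size() == n;
    }
}
```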
Matt Humphrey - 15 Sep 2006 14:53 GMT <snip prelude>
> I assume that if Mark found poor hash table performance during > profiling, he would upgrade the hashCode() method for the key class. [quoted text clipped - 3 lines] > only difficult part of writing a good hashCode(). I just write the two > methods together. I agree with this approach in that objects for which content equality is meaningful (which must be implemented and therefore understood) have an easily implementable hashCode.
My question is, what is a good first implementation of compound hashCode? For objects that have simple fields a, b, c I tend to use
a.hashCode() ^ b.hashCode() ^ c.hashCode()
And for compositions, accumulating hashCodes of elements over ^.
The idea is to get the greatest possible spread while ensuring that two objects that are equal will produce the same value.
Are there better functions to use? What are your rules of thumb?
Matt Humphrey matth@ivizNOSPAM.com http://www.iviz.com/
Thomas Hawtin - 15 Sep 2006 15:17 GMT > My question is, what is a good first implementation of compound hashCode? > For objects that have simple fields a, b, c I tend to use > > a.hashCode ^ b.hashCode ^ c.hashCode That doesn't work well if the hash codes of a, b and c are often equal, a, b and c are often permutations, or whatever.
You could take your lead from String and do something like:
@Override
public int hashCode() {
    int h = a.hashCode();
    h = h*31 + b.hashCode();
    h = h*31 + c.hashCode();
    return h;
}
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/
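Tom's objection to plain XOR can be made concrete. In the sketch below, the class Triple and its method names are hypothetical. XOR ignores field order and cancels out repeated values, while the multiply-by-31 form keeps the order of the fields in the result.

```java
// Hypothetical three-field class used to compare two ways of combining hash codes.
final class Triple {
    private final String a, b, c;

    Triple(String a, String b, String c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    // Order-insensitive: any permutation of the same three values collides,
    // and two equal fields cancel each other out.
    int xorHash() {
        return a.hashCode() ^ b.hashCode() ^ c.hashCode();
    }

    // Order-sensitive, in the style of String.hashCode().
    int multiplyHash() {
        int h = a.hashCode();
        h = h * 31 + b.hashCode();
        h = h * 31 + c.hashCode();
        return h;
    }
}
```

For example, new Triple("x", "y", "z") and new Triple("z", "y", "x") get the same xorHash() but different multiplyHash() values.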
Lord0 - 15 Sep 2006 15:43 GMT toString(): Often, especially for DTO's etc
equals(), hashCode(): Seldom. I may override equals(), again probs for DTO's. If I override equals() then I will always override hashCode()
Lord0
Patricia Shanahan - 15 Sep 2006 16:44 GMT >> My question is, what is a good first implementation of compound hashCode? >> For objects that have simple fields a, b, c I tend to use [quoted text clipped - 13 lines] > return h; > } That is what I usually do.
Patricia
Chris Uppal - 15 Sep 2006 17:19 GMT > My question is, what is a good first implementation of compound hashCode? > For objects that have simple fields a, b, c I tend to use > > a.hashCode ^ b.hashCode ^ c.hashCode > > And for compositions, accumulating hashCodes of elements over ^. Multiplication isn't such a good idea if any of the elements has a tendency to be zero. Xor is also not such a good idea unless it is combined with bit-shifting to mix the bits up (and also to make the hash sensitive to the order of elements in the collection -- which is normally, but not always, what you want).
A couple of approaches:
(a * P + b) * P + c (where P is some prime)
(((a << S) ^ b) << S) ^ c (where S is some small integer)
Either can be generalised to iterating over collections. But the second should be modified to catch the bits as they are shifted off the high end and xor them back in at the low end, otherwise only the last few elements in the collection will contribute to the hash. For example:
int hash(int[] data) {
    int hash = 0;
    for (int i : data) {
        // the bit about to be shifted off the high end
        int wrapBit = hash >>> 31;
        hash = (hash << 1) ^ i ^ wrapBit;
    }
    return hash;
}
There are lots of possible modifications.
-- chris
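The shift-and-wrap loop Chris describes is a one-bit left rotation followed by an XOR, so it can also be written with Integer.rotateLeft from the JDK. The sketch below puts both forms side by side; the class name RotateXorHash and the method names are hypothetical.

```java
final class RotateXorHash {
    // The shift-and-wrap loop, with the wrap-around bit written out by hand.
    static int manual(int[] data) {
        int hash = 0;
        for (int i : data) {
            int wrapBit = hash >>> 31;
            hash = (hash << 1) ^ i ^ wrapBit;
        }
        return hash;
    }

    // The same computation using the JDK's rotate helper.
    static int withRotate(int[] data) {
        int hash = 0;
        for (int i : data) {
            hash = Integer.rotateLeft(hash, 1) ^ i;
        }
        return hash;
    }
}
```

Because each step rotates before mixing in the next element, the result depends on element order: {1, 2} and {2, 1} hash differently.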
Chris Uppal - 15 Sep 2006 16:39 GMT [me:]
> > /Anything/ would be better than that. Throw a runtime exception -- at > > least that would guarantee that you weren't depending on having a > > working hashCode() without realising it. > > It is a functionally correct hashCode. Only in the minimal sense that it abides by its contract. It is not, however, usable for anything; therefore it only exists (or should only exist) as a placeholder for a method which the author assumes will not be needed. But that's a major risk -- sooner or later someone else is likely[*] to start using these things in HashTables, and then they get poor performance without knowing why (or perhaps even without realising that they are getting unnecessarily poor performance). If the assumption is that the method is only a placeholder, then it would be much safer (as I said) to throw an unchecked exception.
If any programmer /I/ worked with was in the habit of leaving these little timebombs ticking away in the codebase, then I'd be inclined to throw a Very Serious Wobbly.
What would you think of a programmer who habitually wrote:
public String toString() { return ""; }
? Yet that's a damn sight safer (and no less "reasonable") than always returning zero from hashCode().
> I assume that if Mark found poor hash table performance during > profiling, As an aside, profiling is not a good tool for recognising or diagnosing poor hashing functions. And, although it is a reasonable tool for confirming that a not-too-special hash function is "good enough", even that is somewhat unsafe unless you know that your test data is representative of real world data /in the way it interacts with the hash/.
-- chris
[*] "likely" assuming normal operation of Sod's Law.
Mark Jeffcoat - 15 Sep 2006 19:08 GMT > As an aside, profiling is not a good tool for recognising or diagnosing poor > hashing functions. And, although it is a reasonable tool for confirming that a > non-too-special hash function is "good enough", even that is somewhat unsafe > unless you know that your test data is representative of real world data /in > the way it interacts with the hash/. Since the constant hashing function always shows up in the same way in the profiler (an unusually large number of equals() comparisons, as your HashMap degrades into a list), I think the profiler is a perfectly good tool.
It's also perfectly safe; you'll never get the wrong answer this way.
The real win comes from two places: first, that you haven't wasted any time or mental effort solving a problem that wouldn't have done you any good anyway. Second, when you do have a need for a real hash function, the performance will be so miserable that you can't miss it, and you'll know exactly where to concentrate your effort. If you'd started with something that's pretty good, that bright red flag won't ever be there, and you may have a harder time spotting your opportunity.
In my recent experience, I need a real hashCode about 5% of the time. For the other 95%, I'd be more productive just setting $20 bills on fire.
(Oooh: just thought of a third way the constant hash function wins: it's really, really hard to screw it up. God help you if you write a hashCode() that's inconsistent with equals().)
 Signature Mark Jeffcoat Austin, TX
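The signal Mark describes, an unusually large number of equals() comparisons, can be reproduced without a profiler by counting the calls. This is only a sketch: CountingKey, EqualsCallCounter and the static counter are hypothetical helpers, and the exact counts depend on the HashMap implementation in use, so only the relative difference is meaningful.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key that counts how often equals() is invoked.
final class CountingKey {
    static long equalsCalls = 0; // global counter, for illustration only

    private final int id;
    private final boolean constantHash;

    CountingKey(int id, boolean constantHash) {
        this.id = id;
        this.constantHash = constantHash;
    }

    @Override
    public boolean equals(Object o) {
        equalsCalls++;
        if (!(o instanceof CountingKey)) return false;
        CountingKey other = (CountingKey) o;
        return other.id == id && other.constantHash == constantHash;
    }

    @Override
    public int hashCode() {
        // Either the constant from the thread or a hash that spreads keys out.
        return constantHash ? 0 : id;
    }
}

final class EqualsCallCounter {
    // Inserts n keys, looks each one up via a fresh equal key,
    // and returns how many times equals() ran in total.
    static long countEqualsCalls(int n, boolean constantHash) {
        CountingKey.equalsCalls = 0;
        Map<CountingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new CountingKey(i, constantHash), i);
        }
        for (int i = 0; i < n; i++) {
            map.get(new CountingKey(i, constantHash));
        }
        return CountingKey.equalsCalls;
    }
}
```

Comparing countEqualsCalls(n, true) with countEqualsCalls(n, false) for the same n shows the constant hash doing noticeably more comparisons.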
AndrewMcDonagh - 15 Sep 2006 22:45 GMT >> Just curious > [quoted text clipped - 11 lines] > > I'm not kidding. If that's your overridden version - your application would be better served not doing anything. At least that way you may get some distribution of the objects within a Hash. With your current approach your Hashes would always only contain one bucket full of every instance.
You are in effect causing a performance problem - not optimizing a slow one.
Patricia Shanahan - 15 Sep 2006 22:51 GMT >>> Just curious >> [quoted text clipped - 14 lines] > distribution of the objects within a Hash. With your current approach > your Hashes would always only contain one bucket full of every instance. Distributing the objects within the hash is very, very bad unless you ensure that any pair of equal objects have the same hash code.
> You are in effect causing a performance problem - not optimizing a slow > one. He is not trying to optimize. He is trying to maintain program correctness, and do optimization later, if it is needed.
Patricia
Jeffrey Schwab - 16 Sep 2006 00:00 GMT >>> Just curious >> [quoted text clipped - 12 lines] > If thats your over ridden version - your application would be better > served not doing anything. If you override equals(), you're supposed to override hashCode() as well, to ensure that equal objects have equal hash codes.
> At least that way you may get some > distribution of the objects within a Hash. With your current approach > your Hashes would always only contain one bucket full of every instance. He knows.
> You are in effect causing a performance problem - not optimizing a slow > one. This is pre-profiling. He's effectively letting his hash use just one bucket, trading performance (from near-constant to O(n) complexity) for simplicity. I'm not sure I like it, but I appreciate the logic.
Patricia Shanahan - 15 Sep 2006 05:03 GMT > Just curious Definitely not. I implement toString when there is a short, but interesting, String representation. I stay with the Object equals and hashCode unless the class has a strong natural concept of value equality, which generally involves immutability.
Patricia
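One reason value equality tends to go with immutability: if a field that feeds hashCode() changes while the object sits in a hashed collection, the collection looks for it under the new hash code and misses it. A minimal sketch with hypothetical classes MutableName and MutableKeyDemo:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable class whose hash code depends on a field that can change.
final class MutableName {
    String name;
    MutableName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableName && ((MutableName) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class MutableKeyDemo {
    // Returns whether the set still finds the very object it contains after a mutation.
    static boolean foundAfterMutation() {
        Set<MutableName> set = new HashSet<>();
        MutableName n = new MutableName("a");
        set.add(n);
        n.name = "b"; // hash code changes while the object sits in the set
        return set.contains(n);
    }
}
```

Here foundAfterMutation() returns false even though the set still holds the object, which is the kind of surprise that keeping the default Object equals and hashCode for mutable classes avoids.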
Danno - 15 Sep 2006 06:11 GMT > > Just curious > [quoted text clipped - 4 lines] > > Patricia You are more eloquent than I am Patricia, that is exactly what I do.
Chris Uppal - 15 Sep 2006 09:09 GMT [In general it is a bad idea to put the question into the subject line without repeating it in the main body of the text].
> Just curious No, certainly not. That would be a very bad mistake.
It's always worth /considering/ overriding toString(), but rarely actually worthwhile. It would often be an /error/ to override equals()/hashCode().
-- chris
GenxLogic - 15 Sep 2006 10:33 GMT Well, toString() is overridden very frequently, but equals() and hashCode() depend upon the need and use of the class. 90% of the time I have skipped both methods.
Deepak
> [In general it is a bad idea to put the question into the subject line without > repeating it in the main body of the text]. [quoted text clipped - 7 lines] > > -- chris
Tor Iver Wilhelmsen - 15 Sep 2006 18:05 GMT > Just curious toString(): Yes, for every case when a bean is used in a context where a textual representation is needed.
hashCode()/equals(): Yes, for every bean used in e.g. Hibernate mappings, esp. composite key classes which are mandated by Hibernate to override them.
Richard Wheeldon - 15 Sep 2006 19:03 GMT > Just curious toString() : frequently, usually for debug information and for objects which will be used directly in a GUI (e.g. nodes in trees, entries in lists, comboboxes, etc.)
equals() : Usually only for business data objects, which count for a fairly small number of classes compared to those related to the gui, reports, database access, etc. I always implement equals when implementing Comparable though.
hashCode() : Most of the times when I implement equals().
Fwiw, in my (rather old) copy of Sun's JDK source code I found 455 toString()s, 377 equals()s and 265 hashCode()s out of around 4000 source files.
Richard
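Richard's habit of implementing equals() alongside Comparable matters because sorted and hashed collections consult different methods. A well-known case in the JDK is BigDecimal, where compareTo() ignores scale but equals() does not. The sketch below (the class name ComparableVsEqualsDemo is hypothetical) adds 1.0 and 1.00 to both kinds of set.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class ComparableVsEqualsDemo {
    // TreeSet decides duplicates with compareTo(), which treats 1.0 and 1.00 as the same.
    static int treeSetSize() {
        Set<BigDecimal> sorted = new TreeSet<>();
        sorted.add(new BigDecimal("1.0"));
        sorted.add(new BigDecimal("1.00"));
        return sorted.size();
    }

    // HashSet decides duplicates with equals()/hashCode(), which treat them as different.
    static int hashSetSize() {
        Set<BigDecimal> hashed = new HashSet<>();
        hashed.add(new BigDecimal("1.0"));
        hashed.add(new BigDecimal("1.00"));
        return hashed.size();
    }
}
```

The two sets end up with different sizes, which is the inconsistency that keeping compareTo() and equals() in agreement avoids for your own classes.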
richardsosborn@gmail.com - 15 Sep 2006 19:28 GMT Not only that but test harness, junit class, class diagram and javadocs before ANY methods are created.
;) (not really)