Java Forum / General / December 2006

HashMaps, hashcodes, equals, and Serialization

Siam - 30 Dec 2006 14:27 GMT
Hi all,

As part of my application, I have an ArticleManager class, that
maintains a HashMap of Article objects (which have an overloaded equals
method - but not an overloaded hashcode). When my application is
closed, the ArticleManager serializes itself and all the Articles.
Questions:

1) How compulsory is it to overload the hashcode method, having
overloaded the equals method. If I don't, will this mean my HashMap
won't function correctly - i.e. if "Article ar1" exists in the hashmap,
and I call get(ar2) on the hashmap, such that ar1.equals(ar2), would it
not return me ar1, since their hashcodes are different? Does the
HashMap code ever test for equality itself when putting and getting
objects, or does it simply rely on the contracted relationship between
hashcode() and equals()? How would I go about writing an effective
custom hashcode function?

2) The contract for hashcode, as written in the API, states "This
integer [the hashcode] need not remain consistent from one execution of
an application to another execution of the same application." Will this
cause complications for when my articles and the hashmap are
serialized, and later deserialized on another execution of my
application? In that, surely if the hashcode is used in determining
where in the table a certain Article object is placed, then if the
hashcode for the Article (when deserialized) changes on another
execution of the application, would the hashmap still be able to find
the Article in its previous place? Or does
serialization/deserialization maintain the hashcode of the object?

Cheers :)

Siam
Lew - 30 Dec 2006 15:46 GMT
> Hi all,
>
[quoted text clipped - 6 lines]
> 1) How compulsory is it to overload the hashcode method, having
> overloaded the equals method.

It is very compulsory, except when you have a distinct reason to make them
inconsistent. Even then it's compulsory, you just might have a reason to
disregard the rule.

No "Hash.+" anything will work correctly if hashCode() and equals() are
inconsistent. That is why they have "Hash" in their names!

> 2) The contract for hashcode, as written in the API, states "This
> integer [the hashcode] need not remain consistent from one execution of
> an application to another execution of the same application." Will this
> cause complications for when my articles and the hashmap are
> serialized, and later deserialized on another execution of my
> application?

Your serialization format should not in any way depend on the hash. If it
does, then you have not written your readObject() and writeObject() methods
correctly.

Read Joshua Bloch's _Effective Java Programming Language Guide_, which covers
both these questions in detail.

- Lew
Daniel Pitts - 30 Dec 2006 18:39 GMT
> Hi all,
>
[quoted text clipped - 29 lines]
>
> Siam

It is always best practice that you override equals and hashCode
together, or not at all.

Basically, it's a contract.  You, as the implementor of your class, are
required to make hashCode and equals work consistently with each other.

Anything that you use to calculate the hashCode must be represented in
the equals comparison. Note that it doesn't need to be the other way
around, however.

Hope this helps.
Daniel.
Mike Schilling - 30 Dec 2006 19:05 GMT
> 2) The contract for hashcode, as written in the API, states "This
> integer [the hashcode] need not remain consistent from one execution of
[quoted text clipped - 7 lines]
> the Article in its previous place? Or does
> serialization/deserialization maintain the hashcode of the object?

During deserialization, the keys are rehashed and the hash chains rebuilt
from scratch, so the new hash table is correct even though the hash values
have changed.
Eric Sosman - 30 Dec 2006 20:42 GMT
> Hi all,
>
[quoted text clipped - 6 lines]
> 1) How compulsory is it to overload the hashcode method, having
> overloaded the equals method.

    Completely optional.  If you had over*ridden* the equals method
that would otherwise have been inherited, that would have been
different: It would then have been mandatory to over*ride* hashCode
as well.

    Overload = Same method name (or constructor) but different
argument list.

    Override = Same method name and same argument list, but
different implementation.  I'm going to assume this is what you
actually meant.
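
    The distinction matters for HashMap in particular. Here is a small
sketch of it (the class names OverloadedKey and OverriddenKey are made up
for illustration, they are not from the original post). HashMap only ever
calls equals(Object), so an equals that takes a narrower parameter type is
simply never seen by the map:

```java
import java.util.HashMap;
import java.util.Map;

public class OverloadVsOverride {

    // Hypothetical key class: equals(OverloadedKey) only OVERLOADS equals.
    static class OverloadedKey {
        final String id;
        OverloadedKey(String id) { this.id = id; }

        // Different parameter type, so Object.equals(Object) is inherited unchanged.
        public boolean equals(OverloadedKey other) {
            return other != null && id.equals(other.id);
        }
    }

    // Hypothetical key class: equals(Object) OVERRIDES, and hashCode goes with it.
    static class OverriddenKey {
        final String id;
        OverriddenKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof OverriddenKey && id.equals(((OverriddenKey) obj).id);
        }

        @Override
        public int hashCode() { return id.hashCode(); }
    }

    // The map calls Object.equals(Object), which compares identity here.
    static boolean overloadedLookupSucceeds() {
        Map<OverloadedKey, String> map = new HashMap<>();
        map.put(new OverloadedKey("a1"), "first");
        return map.containsKey(new OverloadedKey("a1"));
    }

    // The map calls the overriding equals(Object) and the matching hashCode.
    static boolean overriddenLookupSucceeds() {
        Map<OverriddenKey, String> map = new HashMap<>();
        map.put(new OverriddenKey("a1"), "first");
        return map.containsKey(new OverriddenKey("a1"));
    }

    public static void main(String[] args) {
        System.out.println("overloaded key found: " + overloadedLookupSucceeds());
        System.out.println("overridden key found: " + overriddenLookupSucceeds());
    }
}
```

    Putting @Override on equals and hashCode lets the compiler catch an
accidental overload.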

> If I don't, will this mean my HashMap
> won't function correctly - i.e. if "Article ar1" exists in the hashmap,
> and I call get(ar2) on the hashmap, such that ar1.equals(ar2), would it
> not return me ar1, since their hashcodes are different?

    There is no telling what might happen.  With high probability
the HashMap will fail to find anything for ar2; there is a very
tiny probability that it might find whatever it would have found
for ar1.  (This happens if, by some outrageous coincidence, ar1
and ar2 have the same hashCode -- and since the hashCode inherited
from Object tries to give distinct objects distinct values, chances
are against you.)
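
    A minimal sketch of that failure, assuming a stand-in class (here
called BrokenArticle, a made-up name) that overrides equals but leaves
hashCode as inherited from Object, next to a repaired FixedArticle:

```java
import java.util.HashMap;
import java.util.Map;

public class MissingHashCodeDemo {

    // Hypothetical stand-in for the poster's Article: equals overridden, hashCode not.
    static class BrokenArticle {
        final String title;
        BrokenArticle(String title) { this.title = title; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof BrokenArticle && title.equals(((BrokenArticle) obj).title);
        }
        // hashCode() deliberately left as inherited from Object.
    }

    // Same idea with the contract repaired: hashCode uses the field equals compares.
    static class FixedArticle {
        final String title;
        FixedArticle(String title) { this.title = title; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof FixedArticle && title.equals(((FixedArticle) obj).title);
        }

        @Override
        public int hashCode() { return title.hashCode(); }
    }

    // Looks up an equal-but-distinct key; almost always returns null.
    static String brokenLookup() {
        Map<BrokenArticle, String> map = new HashMap<>();
        map.put(new BrokenArticle("News"), "stored");
        return map.get(new BrokenArticle("News"));
    }

    // Looks up an equal-but-distinct key; finds the stored value.
    static String fixedLookup() {
        Map<FixedArticle, String> map = new HashMap<>();
        map.put(new FixedArticle("News"), "stored");
        return map.get(new FixedArticle("News"));
    }

    public static void main(String[] args) {
        System.out.println("broken lookup: " + brokenLookup());
        System.out.println("fixed lookup:  " + fixedLookup());
    }
}
```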

> Does the
> HashMap code ever test for equality itself when putting and getting
> objects, or does it simply rely on the contracted relationship between
> hashcode() and equals()?

    Strictly speaking, you're not supposed to know or care how the
internals of HashMap work; that's part of Java's raison d'etre.  All
you're supposed to care about is that if you stick to your side of
the bargain, HashMap will fulfill its obligations.  Part of your
bargain with HashMap is that any pair of items for which equals is
true must have identical hashCode values: If you don't ensure that
this is so, you have not kept your side of the bargain and HashMap
is not obliged to behave as desired.

    In actuality, HashMap first uses the key's hashCode to locate
a "bucket" -- a linked list internal to the HashMap -- where the
key/value pair resides if it's in the map at all.  Then it walks
through the list looking for an existing entry with a key that has
the same hashCode as the one you're looking for and for which
equals returns true.  If it finds such a key, hooray!  Otherwise,
the HashMap concludes that the key isn't in the map.  And that's
why hashCode and equals must agree: if two equals objects have
different hashCodes, HashMap's search will come up empty.
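
    The lookup just described can be sketched as a toy map. This is an
illustration only, not the real java.util.HashMap source: the class name
TinyHashMap is made up, the bucket count is fixed, and null keys are not
supported.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHashMap<K, V> {

    // One stored pair, remembering the hash it was filed under.
    private static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;

        Entry(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<List<Entry<K, V>>>();

    // bucketCount must be positive; this sketch never resizes.
    public TinyHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<Entry<K, V>>());
        }
    }

    // Step 1: the hashCode alone decides which bucket to look in.
    private List<Entry<K, V>> bucketFor(int hash) {
        return buckets.get(Math.floorMod(hash, buckets.size()));
    }

    public void put(K key, V value) {
        int hash = key.hashCode();
        List<Entry<K, V>> bucket = bucketFor(hash);
        for (Entry<K, V> e : bucket) {
            if (e.hash == hash && e.key.equals(key)) {
                e.value = value;   // existing key: replace the value
                return;
            }
        }
        bucket.add(new Entry<K, V>(hash, key, value));
    }

    public V get(Object key) {
        int hash = key.hashCode();
        // Step 2: walk the bucket, requiring same hash AND equals.
        for (Entry<K, V> e : bucketFor(hash)) {
            if (e.hash == hash && e.key.equals(key)) {
                return e.value;
            }
        }
        return null;   // not in this bucket means not in the map
    }

    public static void main(String[] args) {
        TinyHashMap<String, Integer> map = new TinyHashMap<String, Integer>(8);
        map.put("one", 1);
        map.put("two", 2);
        map.put("one", 11);
        System.out.println(map.get("one"));
        System.out.println(map.get("three"));
    }
}
```

    If two equal keys report different hashCodes, get usually looks in
the wrong bucket, and even in the right bucket the hash comparison fails
before equals is ever consulted.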

> How would I go about writing an effective
> custom hashcode function?

    Compute something (repeatably) from some subset of the things
equals checks.  Here is a valid but not very good version:

    public int hashCode() { return 42; }

... where the "subset" is empty.  Note that this fulfills the bargain
with HashMap: any two equal objects have equal hashCodes, because all
objects have the same hashCode.  With this hashCode, HashMap will
work correctly -- slowly, but correctly.
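
    To see that for yourself, here is a small sketch (the class
ConstantKey and the helper countFound are made-up names). Every key lands
in the same bucket, yet every lookup still succeeds:

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantHashDemo {

    // Hypothetical key whose hashCode is the same for every instance.
    static class ConstantKey {
        final int x;
        final int y;

        ConstantKey(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof ConstantKey)) {
                return false;
            }
            ConstantKey that = (ConstantKey) obj;
            return x == that.x && y == that.y;
        }

        @Override
        public int hashCode() {
            return 42;   // legal: equal objects trivially have equal hashCodes
        }
    }

    // Stores n distinct keys, then counts how many are found via fresh equal keys.
    static int countFound(int n) {
        Map<ConstantKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantKey(i, -i), i);
        }
        int found = 0;
        for (int i = 0; i < n; i++) {
            Integer value = map.get(new ConstantKey(i, -i));
            if (value != null && value == i) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("found " + countFound(1000) + " of 1000 keys");
    }
}
```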

    A better hashCode tries to "spread out" the computed values to
decrease the chance that a pair of unequal objects will have the
same hashCode.  Start by considering the tests that equals does:
the items equals checks are all candidates for inclusion in the
hashCode computation.  So, for example,

    class ArticleManager {
       private int xpos;
       private int ypos;
       private String text;   // assumed non-null in this example
       // ...
       @Override
       public boolean equals(Object obj) {
           // instanceof is false for null, so no separate null check
           if (! (obj instanceof ArticleManager))
               return false;
           ArticleManager that = (ArticleManager)obj;
           return this.xpos == that.xpos
               && this.ypos == that.ypos
               && this.text.equals(that.text);
       }
       @Override
       public int hashCode() {
           // uses a subset of what equals checks, which is allowed
           return text.hashCode();
       }
    }

would work.  Things will usually work a little better if you
incorporate more of the equals-relevant fields and "stir" them
unsystematically into the mix:

       public int hashCode() {
           int hash = 9882345;        // arbitrary value
           hash = hash * 509 + xpos;  // primes are traditional
           hash = hash * 503 + ypos;
           hash = hash * 499 + text.hashCode();
           return hash;
       }
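
    On Java 7 and later, the same "combine the equals-relevant fields"
idea can be written with java.util.Objects, which also copes with a null
text field. A sketch, assuming a made-up class name PlacedText with the
same three fields as above:

```java
import java.util.Objects;

public class PlacedText {
    private final int xpos;
    private final int ypos;
    private final String text;   // may be null in this sketch

    public PlacedText(int xpos, int ypos, String text) {
        this.xpos = xpos;
        this.ypos = ypos;
        this.text = text;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof PlacedText)) {
            return false;
        }
        PlacedText that = (PlacedText) obj;
        return xpos == that.xpos
            && ypos == that.ypos
            && Objects.equals(text, that.text);   // null-safe comparison
    }

    @Override
    public int hashCode() {
        // Combines exactly the fields that equals compares.
        return Objects.hash(xpos, ypos, text);
    }

    public static void main(String[] args) {
        PlacedText a = new PlacedText(1, 2, "hello");
        PlacedText b = new PlacedText(1, 2, "hello");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

    Objects.hash boxes its arguments into an array on every call, so the
hand-rolled version above may still be preferable on a hot path.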

> 2) The contract for hashcode, as written in the API, states "This
> integer [the hashcode] need not remain consistent from one execution of
[quoted text clipped - 7 lines]
> the Article in its previous place? Or does
> serialization/deserialization maintain the hashcode of the object?

    If the hashCode computation is "expensive," you might arrange
to compute it just once and store the result in a private field of
the object, either when the object is constructed or the first time
hashCode is called.  If you serialize and deserialize that field
there could be trouble, because the contributing hashCodes of some
of the referenced objects might be different and the deserialized
value might not match what you'd get if you computed it anew.  But
if you compute the hashCode fresh each time, or if you arrange to
recompute it upon deserialization, all should be well.
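
    One way to arrange that, sketched under assumptions: the class name
CachedHashArticle and its two fields are made up, and the cache is marked
transient so it is never written to the stream. After deserialization the
field comes back as 0 and the hash is recomputed on first use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CachedHashArticle implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String title;
    private final String body;

    // transient: skipped by serialization, so it is 0 again after readObject.
    private transient int cachedHash;

    public CachedHashArticle(String title, String body) {
        this.title = title;
        this.body = body;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof CachedHashArticle)) {
            return false;
        }
        CachedHashArticle that = (CachedHashArticle) obj;
        return title.equals(that.title) && body.equals(that.body);
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == 0) {   // not computed yet (or the real hash happens to be 0)
            h = 31 * title.hashCode() + body.hashCode();
            cachedHash = h;
        }
        return h;
    }

    // Serializes to memory and reads back; wraps checked exceptions for brevity.
    static CachedHashArticle roundTrip(CachedHashArticle article) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(article);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CachedHashArticle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CachedHashArticle original = new CachedHashArticle("News", "Some text");
        original.hashCode();   // fill the cache before serializing
        CachedHashArticle copy = roundTrip(original);
        System.out.println(copy.equals(original) && copy.hashCode() == original.hashCode());
    }
}
```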

    As for the HashMap, the folks who wrote it were aware that the
hashCodes of the keys it holds might change on deserialization, so
they wrote HashMap's own deserialization code to allow for that
fact.  HashMap essentially "reloads" itself on deserialization,
storing each key/value pair according to the "new" hashCode values.
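
    You can watch this happen inside a single JVM with a little trickery.
In the sketch below, a static runSalt field stands in for "a different
execution": bumping it changes every key's hashCode, the way a restart
might. The names SaltedKey, runSalt and roundTripAcrossRuns are all made
up for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class RehashOnDeserializeDemo {

    // Simulation only: changing this changes every SaltedKey's hashCode.
    static int runSalt = 1;

    static class SaltedKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;

        SaltedKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof SaltedKey && id.equals(((SaltedKey) obj).id);
        }

        @Override
        public int hashCode() {
            // Consistent with equals at any moment, but differs between "runs".
            return id.hashCode() + runSalt;
        }
    }

    // Writes the map, pretends the application restarted, then reads it back.
    @SuppressWarnings("unchecked")
    static HashMap<SaltedKey, String> roundTripAcrossRuns(HashMap<SaltedKey, String> map) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(map);
            }
            runSalt++;   // static fields are not serialized; hashCodes now differ
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<SaltedKey, String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<SaltedKey, String> original = new HashMap<>();
        original.put(new SaltedKey("article-1"), "stored");

        HashMap<SaltedKey, String> copy = roundTripAcrossRuns(original);

        // The deserialized map was rebuilt using the current hashCodes.
        System.out.println("copy:     " + copy.get(new SaltedKey("article-1")));
        // The old in-memory map still files the key under its stale hash.
        System.out.println("original: " + original.get(new SaltedKey("article-1")));
    }
}
```

    The deserialized copy finds the key; the original map, whose entries
were filed under the old hashCode, no longer does.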

Eric Sosman
esosman@acm-dot-org.invalid



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.