Java Forum / General / March 2008
Java HashSet Problem
twizansk@gmail.com - 14 Mar 2008 00:53 GMT Hello,
I am having a problem using a hashset for user defined objects. Identical objects are not being recognized as identical.
Here is the relevant code:
import java.util.*;
class MyString {
MyString(String s, int i) { this.s=s; this.i=i; }
private String s; private int i;
public boolean equals(MyString mystring) {
return (s.equals(mystring.s) && i==mystring.i);
}
public int hashCode() {
int result=1; result=37*result + i; result=37*result + s.hashCode();
return result;
}
}
public class Temp {
public static void main(String[] args) {
MyString key1 = new MyString("ten",10); MyString key2 = new MyString("ten",10);
HashSet<MyString> hset = new HashSet<MyString>(); hset.add(key1); System.out.println("hashset contains key1: " + hset.contains(key1)); System.out.println("hashset contains key2: " + hset.contains(key2)); System.out.println("key1.equals(key2): " + key1.equals(key2)); System.out.println("key2.equals(key1): " + key2.equals(key1)); System.out.println("key1 hashcode: " + key1.hashCode()); System.out.println("key2 hashcode: " + key2.hashCode());
}
}
The output is:
hashset contains key1: true hashset contains key2: false key1.equals(key2): true key2.equals(key1): true key1 hashcode: 116456 key2 hashcode: 116456
Apparently, the hashset doesn't realize that key1 and key2 are equal even though the hash codes are equal and the equals method is implemented correctly.
Does anyone know what I'm doing wrong?
Thanks
Andrew Thompson - 14 Mar 2008 01:06 GMT On Mar 14, 10:53 am, twiza...@gmail.com wrote: ...
> I am having a problem using a hashset for user defined objects. > Identical objects are not being recognized as identical. Huh? Your output suggests otherwise.*
> Here is the relevant code: As an aside. Good example code. Compiled and ran (producing identical results here) without changes.
> The output is: > > hashset contains key1: true > hashset contains key2: false > key1.equals(key2): true > key2.equals(key1): true * Here it is saying that it recognises that key1 and key2 *are* the same according to the specified test of equality.
> Does anyone know what I'm doing wrong? I do not understand your question. Did I miss something?
-- Andrew T. PhySci.org
twizansk@gmail.com - 14 Mar 2008 01:19 GMT > On Mar 14, 10:53 am, twiza...@gmail.com wrote: > ... [quoted text clipped - 29 lines] > Andrew T. > PhySci.org Sorry. I'll clarify the question. key1 and key2 are recognized as equal using the equals method and they do generate the same hash code. Therefore, when I insert key1 into hset and ask if the set contains key2 (using the contains() method) the answer should be yes. However, as you can see, the hash set does not realize that it contains key2.
I hope that clarifies the problem I'm having.
Thanks
Arved Sandstrom - 14 Mar 2008 02:02 GMT [ SNIP ]
> Sorry. I'll clarify the question. key1 and key2 are recognized as > equal using the equals method and they do generate the same hash [quoted text clipped - 6 lines] > > Thanks Except, where did you add key2? :-)
AHS
 Signature * change 'two' to '2' to email me
Lew - 14 Mar 2008 02:08 GMT twizansk@gmail.com wrote:
> import java.util.*; > [quoted text clipped - 29 lines] >> However, as you can see, the hash set does not realize that it >> contains key2. You have to override <http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals(java.lang.Object)> not overload it.
> Except, where did you add key2? :-) That was the whole point - he didn't. The collection was supposed to compare physically distinct objects as logically equals(). It should have returned that key2 was contained. Except for the failure to override equals().
 Signature Lew
Andrew Thompson - 14 Mar 2008 04:44 GMT ...
> ...The collection was supposed to compare > physically distinct objects as logically equals(). It should have returned > that key2 was contained. Except for the failure to override equals(). It took me a while to understand your Zen-like answer.
Now having solved it, and now with some enthusiasm, I intend to reduce your beautiful answer to something far more 'common'.
public boolean equals(Object mystring) {
if ( mystring instanceof MyString ) { MyString astring = (MyString)mystring; return (s.equals(astring.s) && i==astring.i); } else { return false; } }
-- Andrew (just call me 'Grasshopper') T.
Andrew Thompson - 14 Mar 2008 04:47 GMT > <http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals(ja...)> > not overload it. Dang! How did I miss that first sentence?!
-- Andrew (harumph) T.
Andrew Thompson - 14 Mar 2008 04:52 GMT > > <http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals(ja...)> > > not overload it. > > Dang! How did I miss that first sentence?! ..not to mention Eric's even more direct answer!
I'm going to retreat to a quiet place for a (short) while and nurse my wounded pride. :-(
-- Andrew (what-the-hell-was-I-thinking) T.
Lew - 14 Mar 2008 05:12 GMT > I'm going to retreat to a quiet place for > a (short) while and nurse my wounded pride. :-( Actually, your pride should be strengthened. After all, you did figure it out on the earliest hints, before resorting to the easy answers.
 Signature Lew
Lasse Reichstein Nielsen - 14 Mar 2008 07:09 GMT >> <http://java.sun.com/javase/6/docs/api/java/lang/Object.html#equals(ja...)> >> not overload it. > > Dang! How did I miss that first sentence?! Don't feel too bad. Most people make that exact mistake once or twice
:) /L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Roedy Green - 14 Mar 2008 03:00 GMT >Sorry. I'll clarify the question. key1 and key2 are recognized as >equal using the equals method and they do generate the same hash >code. Therefore, when I insert key1 into hset and ask if the set >contains key2 (using the contains() method) the answer should be yes. >However, as you can see, the hash set does not realize that it >contains key2. "List the output and the output you expected. In a certain sense, your program nearly always works perfectly. It is just that you misunderstood what that piece of code was supposed to produce. It usually produces a result completely in conformance with what you asked for; it is just not the result you wanted." ~ http://mindprod.com/jgloss/sscce.html
--
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Eric Sosman - 14 Mar 2008 02:09 GMT > Hello, > [quoted text clipped - 3 lines] > public boolean equals(MyString mystring) { > [...] HashSet is going to call the equals(Object) method, not the equals(MyString) method. Since you have not overridden equals(Object), you wind up using the version from Object itself, which declares any two objects different unless they are in fact the same instance.
 Signature Eric Sosman esosman@ieee-dot-org.invalid
Roedy Green - 14 Mar 2008 02:51 GMT > int result=1; > result=37*result + i; > result=37*result + s.hashCode(); > > return result; why did you write that in such a meandering way? You could have got the same essential effect with
return i * 37 + s.hashCode(); or return i ^ s.hashCode(); or even return i + s.hashCode();
--
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Peter Duniho - 14 Mar 2008 03:04 GMT >> int result=1; >> result=37*result + i; [quoted text clipped - 4 lines] > why did you write that in such a meandering way? You could have got > the same essential effect with I'm not the OP, but the code was probably written that way because it more clearly states the algorithm being used, can be more easily modified if the members used to calculate the hash code change (just copy-and-paste lines for new members or delete removed ones), and because any decent compiler can easily turn it into practically the equivalent of the most-correct alternative you posted (there's an extra add, but that's no big deal).
In other words, the main thing the examples you posted accomplish is obfuscating the calculation. I realize that's a goal for some programmers, but many of us prefer code that's more maintainable.
Pete
Roedy Green - 14 Mar 2008 06:49 GMT On Thu, 13 Mar 2008 19:04:39 -0700, "Peter Duniho" <NpOeStPeAdM@nnowslpianmk.com> wrote, quoted or indirectly quoted someone who said :
>In other words, the main thing the examples you posted accomplish is >obfuscating the calculation Don't be silly. My examples are much clearer.
 Signature
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Peter Duniho - 14 Mar 2008 07:18 GMT >> In other words, the main thing the examples you posted accomplish is >> obfuscating the calculation > > Don't be silly. My examples are much clearer. I didn't expect you to agree with me; your reputation precedes you. Suffice to say though, your own opinion regarding readability and clarity is not the last word on the matter.
Pete
Roedy Green - 14 Mar 2008 13:02 GMT On Thu, 13 Mar 2008 23:18:24 -0700, "Peter Duniho" <NpOeStPeAdM@nnowslpianmk.com> wrote, quoted or indirectly quoted someone who said :
>I didn't expect you to agree with me; your reputation precedes you. >Suffice to say though, your own opinion regarding readability and clarity >is not the last word on the matter. The general rule is brevity adds to clarity. Adding code that does not do anything is just one more thing you have to needlessly figure out.
Agreed that a regular pattern makes it easy to extend, but that code had needless fillips that disguised the regularity.
 Signature
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Roedy Green - 14 Mar 2008 03:39 GMT HashSet.contains is defined as Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e)).
so you DON'T need an == match to count as contains.
I think you have found a bug.
Have a peek at the code in src.zip to see if you can see if they are doing an == instead of an equals for contains. --
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Lew - 14 Mar 2008 05:09 GMT > HashSet.contains is defined as > Returns true if this set contains the specified element. More [quoted text clipped - 4 lines] > > I think you have found a bug. Yes, in the OP's code.
> Have a peek at the code in src.zip to see if you can see if they are > doing an == instead of an equals for contains. They're using equals( Object ), not equals( MyString ), for contains().
 Signature Lew
Roedy Green - 14 Mar 2008 07:03 GMT >They're using equals( Object ), not equals( MyString ), for contains(). Ah ha!
He should have overridden equals( Object o ) not defined a new method equals ( MyString o ) because HashSet.contains will use the == default implemenation of equals ( Object o ).
see http://mindprod.com/jgloss/equals.html
 Signature
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Gordon Beaton - 14 Mar 2008 07:50 GMT >>They're using equals( Object ), not equals( MyString ), for contains(). > > Ah ha! Indeed. Perhaps if you actually *read* the thread before posting your reply, you would have seen that the correct answer had already been posted twice by others.
/gordon
--
Roedy Green - 14 Mar 2008 13:05 GMT >Indeed. Perhaps if you actually *read* the thread before posting your >reply, you would have seen that the correct answer had already been >posted twice by others. Guilty as charged. In general I don't read all the threads, mainly because so few of the posts have anything to do with the topic at hand. Most are about one-up-manship games, as was your post.
 Signature
Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
twizansk@gmail.com - 14 Mar 2008 19:02 GMT On Mar 14, 5:05 am, Roedy Green <see_webs...@mindprod.com.invalid> wrote:
> >Indeed. Perhaps if you actually *read* the thread before posting your > >reply, you would have seen that the correct answer had already been [quoted text clipped - 7 lines] > Roedy Green Canadian Mind Products > The Java Glossaryhttp://mindprod.com Thanks everyone. I overrode equals() instead of overloading it and now it works perfectly. Stupid mistake o my part but I guess you learn something new every day.
Andrew Thompson - 15 Mar 2008 00:06 GMT On Mar 15, 5:02 am, twiza...@gmail.com wrote: ..
> ...I overrode equals() instead of overloading it and > now it works perfectly. Stupid mistake o my part but I guess you > learn something new every day. That's part of what what makes them (days) fun. :-)
-- Andrew T.
pjvleeuwen@gmail.com - 20 Mar 2008 15:49 GMT Let's digg up this thread because to me it seems the JavaDoc on HashSet#contains(Object o) is not right.
It all started with these lines in a unit-test (and a failure on the last line): Person bob = new Person("Bob", 'm', new Date(1, 7, 2007), "1"); Set<Person> store = new HashSet<Person>(); store.add(bob); assertTrue(bob.equals(new Person( // "Bob", 'm', new Date(1, 7, 2007), "1"))); assertTrue(store.contains(bob)); assertTrue(store.contains(new Person( // "Bob", 'm', new Date(1, 7, 2007), "1")));
Person overwrites equals(Object) in a correct way, but does not overwrite the hashCode() method. When I look at the JavaDoc of HashSet#contains(Object o) I get: " Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e)). "
But! when we check the source of the HashSet#contains(Object ourPerson) we see it uses HashMap.containsKey(Object ourPerson) which uses HashMap.getEntry(Object ourPerson). now getEntry's source looks like: " int hash = (key == null) ? 0 : hash(key.hashCode()); for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } return null; "
If I am reading this correctly, then the equals method is only evaluated when the hashes of the two objects are equals. Since I did not overwrite Object#hashCode() HashSet#contains(Object) returns false in the above scenario.
Since I couldn't find any other post on this matter (or bug-report) I want to ask you guys to check my above statements.
Ow, for the record the HashSet class I have is labeled: /* * @(#)HashSet.java 1.37 06/04/21 * * Copyright 2006 Sun Microsystems, Inc. All rights reserved. * SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms. */ and my java lives In C:\jdk1.6.0_03 (so that would probably be the version then)
Cheers, Paul
Patricia Shanahan - 20 Mar 2008 16:30 GMT ...
> Person overwrites equals(Object) in a correct way, but does not > overwrite the hashCode() method. Almost invariably, equals cannot be overridden "in a correct way" without also overriding hashCode().
...
> If I am reading this correctly, then the equals method is only > evaluated when the hashes of the two objects are equals. Since I did > not overwrite Object#hashCode() HashSet#contains(Object) returns false > in the above scenario. ...
Correct, but to turn this into a problem in HashSet, rather than in yoru class, you need to ignore a critical piece of documentation.
HashSet, like all Java classes, is limited to operating on objects whose class extends, directly or indirectly, java.lang.Object. Therefore, its implementation can assume that the objects in the set follow the rules stated in the java.lang.Object documentation.
From the documentation for equals: "Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.". This makes overriding equals without maintaining the hashCode contract incorrect.
From the documentation for hashCode: "If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result."
The Java APIs are full of method implementations that depend on methods in other classes behaving according to some aspect of the documented contract for one of the class's superclasses or interfaces. Generally, those dependencies are only documented through the type of the reference. Otherwise, the documentation would be impossibly cluttered with some phrase like "... provided parameter x of type X conforms to the X contract."
Requiring an Object reference means that the Object contract applies, not that there is no contract.
Patricia
pjvleeuwen@gmail.com - 23 Mar 2008 21:39 GMT Patricia, thanks for your comments on this.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|