Hi,
Is a hashvalue for a string using the function String.hashCode() always
unique.
John
Lew - 02 Jun 2007 16:02 GMT
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
If by "unique" you mean that any distinct String has a distinct hashCode(), it
is not guaranteed.
Have you read the Javadocs for hashCode()? The API docs are very often a
perfect first place to find such answers.
<http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode()>
> It is /not/ required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.
You can examine the particular algorithm for String at
<http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode()>
Even without Javadocs it is easy to prove that the hash codes for Strings
cannot be unique for all possible String values. There are only 2**32 - 1
possible hashCode() results. Any String longer than two characters has a
larger value set than that. The sum of the value set counts for Strings of
any length is larger still.
Gotta love those Javadocs.

Signature
Lew
Daniel Dyer - 02 Jun 2007 16:05 GMT
> Hi,
>
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
>
> John
No. Consider that hasCode returns an int. An int is a 32-bit value, so
there are only 2^32 (approx. 4.3 billion) possible values. There number
of possible Strings is, for all practical purposes, unlimited. So it
would not be possible to write a hashCode implementation that generates a
different value for each different String.
Dan.

Signature
Daniel Dyer
http//www.uncommons.org
Lew - 02 Jun 2007 16:05 GMT
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
If by "unique" you mean that any distinct String has a distinct hashCode(), it
is not guaranteed.
Have you read the Javadocs for hashCode()? The API docs are very often a
perfect first place to find such answers.
<http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode()>
> It is /not/ required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.
(emph. orig.)
You can examine the particular algorithm for String at
<http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode()>
Even without Javadocs it is easy to prove that the hash codes for Strings
cannot be unique for all possible String values. There are only 2**32
possible hashCode() results. Any String longer than two characters has a
larger value set than that. The sum of the value set counts for Strings of
any length is larger still.
Gotta love those Javadocs.

Signature
Lew
Robert Klemme - 02 Jun 2007 18:53 GMT
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
Hash values are by definition not guaranteed to be unique.
http://en.wikipedia.org/wiki/Hash_function#Properties
Regards
robert
Tom Hawtin - 02 Jun 2007 20:36 GMT
> Hash values are by definition not guaranteed to be unique.
>
> http://en.wikipedia.org/wiki/Hash_function#Properties
There are hash functions that are guaranteed to be unique.
http://en.wikipedia.org/wiki/Perfect_hashing
For instance, Boolean.hashCode returns a unique hash for each (both)
possible values.
Tom Hawtin
Robert Klemme - 02 Jun 2007 20:47 GMT
>> Hash values are by definition not guaranteed to be unique.
>>
[quoted text clipped - 3 lines]
>
> http://en.wikipedia.org/wiki/Perfect_hashing
Right. Thanks for refreshing my CS knowledge!
> For instance, Boolean.hashCode returns a unique hash for each (both)
> possible values.
Which doesn't seem too difficult given the size of the domain. :-)
I believe though, that in practice usually imperfect hashing is used -
especially if you consider that Java collections that use hashing
internally cannot know from what domain values will be.
Kind regards
robert
Tom Hawtin - 02 Jun 2007 20:35 GMT
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
The String returned by Integer.toString(int) for all possible values of
the argument have unique values. However there are String values that
cannot be returned by Integer.toString(int). As hashCode returns an int,
clearly there has to be some clashes somewhere.
Tom Hawtin
Phi - 03 Jun 2007 19:36 GMT
Bruintje Beer schrieb:
> Hi,
>
> Is a hashvalue for a string using the function String.hashCode() always
> unique.
>
> John
Hi
For two references pointing to an "equal()" String, the hash value must
be equal.
But be aware, that the old JVM (1.0.2) used a different algorithm for
"hashCode()" than the new (1.6).
I have forgotten, in which version it realy changed; but if you want to
have a unique hash-value that is unique accross different
Java-platforms, copy the source of a "good" hash code (eg VM 1.6.0) into
your hash-sensitive code.
phi