Java Forum / General / June 2007

Distinct ID Number Per Object?

Hal Vaughan - 16 Jun 2007 06:12 GMT
I have a case where I'll need distinct and printable names to use in a
reference table.  I'd like to make it so each object, whether it's of the
same class as any other object or not, can produce a distinct number.  It
looks like if I get the hashcode for any object, the JVM attempts to give
each object a unique hashcode, but it doesn't seem to guarantee it.

Is there any way to get a unique string or number for each object that is
created by a particular JVM?

Thanks!

Hal
Stefan Ram - 16 Jun 2007 06:34 GMT
>Is there any way to get a unique string or number for each
>object that is created by a particular JVM?

 The size of a string is bounded by the finite memory the JVM
 can allocate. With, for example, 4 GB of memory, a string
 might be at most 2^32 Bytes long, so there are at most
 2^(2^40) unique strings of this length.

 On the other hand, the loop

while( true )new java.lang.Object();

 can run for more than 2^(2^40) iterations, as memory is
 reclaimed by the garbage collector. Thus it can create an
 unlimited number of new objects, especially 2^(2^40)+1.  

 In this case, there are not enough distinct strings
 possible in this JVM to give each of these 2^(2^40)+1 objects
 a unique string.
Hal Vaughan - 16 Jun 2007 07:13 GMT
>>Is there any way to get a unique string or number for each
>>object that is created by a particular JVM?
[quoted text clipped - 15 lines]
>   possible in this JVM to give each of these 2^(2^40)+1 objects
>   a unique string.

So is it only in extreme cases like this where hashcodes would be
duplicated?

Hal
Twisted - 16 Jun 2007 07:51 GMT
> So is it only in extreme cases like this where hashcodes would be
> duplicated?

Not necessarily, but it's unlikely you'll run out of unique ones in a
production environment.

I suggest you use System.identityHashCode(Object) to get these
numbers. It should be a) fixed for an object's lifetime in one session
(it will change when the object is serialized and later deserialized);
b) globally unique (within the one JVM anyway) as the usual
implementation of the default hash code for Object is the memory
address of that object, which is necessarily globally unique in that
scope; and c) not subject to being overridden unlike calling
hashCode() on the object. This of course works if you need a globally
unique ID per object, even to the point of two copies of a single
object (so equals()) have distinct such IDs. Try to just use hashCode
otherwise.
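
A minimal sketch of point c), assuming a hypothetical value class named Name that overrides equals() and hashCode(). It only shows that System.identityHashCode(Object) ignores the override and stays the same for one object; it does not show that the value is unique, which the API does not promise.

```java
public class IdentityHashDemo {
    /** Hypothetical value class that overrides equals() and hashCode(). */
    static final class Name {
        final String text;

        Name(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Name && ((Name) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    public static void main(String[] args) {
        Name a = new Name("table");
        Name b = new Name("table");
        // Equal objects share the overridden hash code.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
        // identityHashCode bypasses the override and is stable for one object.
        System.out.println(System.identityHashCode(a) == System.identityHashCode(a));
        // These two values usually differ, but nothing guarantees it.
        System.out.println(System.identityHashCode(a) + " " + System.identityHashCode(b));
    }
}
```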

A second option is to create an IdentityHashMap<Object, Integer> and
stuff objects into that. Distinct objects act as distinct keys (even
if equals()) and you'd assign a new higher integer to each one. Use
Long if you run out of Integers (unlikely). Use BigInteger if you run
out of Longs (unlikely until we have converted most of the visible
universe into computronium). This has the benefit of giving out ever-
higher numbers even to objects that use the memory space where an
earlier object was before being garbage collected. You can't twiddle
the Object constructor to put everything in automatically, but you can
fake it by having the get-ID method you plan to use actually assign an
ID to objects that don't already have one lazily when it's first
requested. Of course, the objects won't get garbage collected unless
you use a WeakReference, in which case you may as well use an ordinary
HashMap<WeakReference<Object>, Integer> as the WeakReference hashCode
is the default one if I recall correctly. (WeakHashMap would cause
distinct objects that compare equal to have the same ID.) This is
complicated because you need to carry the WeakReference around with
you to look up in the HashMap (you can't just make a new one to the
same object and expect it to work). Perhaps a better option is to wrap
objects that will need IDs in a dummy object that has a single public
Object field, the default equals and hashCode, and make a
WeakHashMap<Dummy, Integer>. The wrapper object has to be used in
place of the original object on any path that leads to getting its ID.
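
The lazy-assignment idea can be sketched roughly as follows. The class name ObjectIdRegistry and the method idOf are made up for illustration. This version uses a plain IdentityHashMap, so it keeps a strong reference to every object it has numbered and those objects will not be garbage collected while the registry is alive.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public final class ObjectIdRegistry {
    // Keys are compared with ==, so equal-but-distinct objects get distinct IDs.
    private final Map<Object, Long> ids = new IdentityHashMap<>();
    private long next = 0;

    /** Returns the ID of the object, assigning the next free one on first request. */
    public synchronized long idOf(Object o) {
        Long id = ids.get(o);
        if (id == null) {
            id = next++;
            ids.put(o, id);
        }
        return id;
    }

    public static void main(String[] args) {
        ObjectIdRegistry registry = new ObjectIdRegistry();
        String first = new String("x");
        String second = new String("x"); // equals(first), but a different object
        System.out.println(registry.idOf(first));
        System.out.println(registry.idOf(second));
        System.out.println(registry.idOf(first)); // same ID as the first call
    }
}
```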

Finally, if this ID is only needed for objects of classes you control,
you can make the base classes you control generate unique ID numbers
to put in a public final field. This is easiest if you can derive all
the classes for which you have this requirement from a single base.
Otherwise, the ID can be a long with 32 bits a sequentially-assigned
int and 32 bits the hashCode of the particular base class.
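
For the two-part ID, the bit packing might look like the sketch below. PackedId and its method names are hypothetical. The high half is only as distinct as the int you feed it, and a hash code of a class is not guaranteed to differ from that of another class.

```java
public final class PackedId {
    private PackedId() {
    }

    /** Puts high in the upper 32 bits and low in the lower 32 bits. */
    public static long pack(int high, int low) {
        // The mask stops sign extension of low from overwriting the upper half.
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static int high(long id) {
        return (int) (id >>> 32);
    }

    public static int low(long id) {
        return (int) id;
    }

    public static void main(String[] args) {
        long id = pack(7, -3);
        System.out.println(high(id) + " " + low(id)); // prints "7 -3"
    }
}
```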
Hal Vaughan - 16 Jun 2007 08:09 GMT
Wow!  Thanks for such a complete answer.  I learned a LOT from your post!

Hal

>> So is it only in extreme cases like this where hashcodes would be
>> duplicated?
[quoted text clipped - 44 lines]
> Otherwise, the ID can be a long with 32 bits a sequentially-assigned
> int and 32 bits the hashCode of the particular base class.
Lew - 16 Jun 2007 16:51 GMT
>>> So is it only in extreme cases like this where hashcodes would be
>>> duplicated?

Hash codes have even fewer values than Strings.  That means there must be
proportionately more collisions.  Have you read the Javadocs on the hashCode()
method?  You should.  Also read the Javadocs on Map, HashMap and IdentityHashMap.

As Twisted pointed out, the "Identity", i.e., the internal "address" of an
object, is unique for the lifetime of that object.  Even without
IdentityHashMap, any Map can use an object that doesn't override equals()
(most custom objects, for example) as a unique key into a lookup.  It is
sufficient to use a regular Map (e.g., HashMap) when equals() and == define
the same relation.  IdentityHashMap is for when they differ and you want the
key selection to be based on ==.
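
A small sketch of that difference, using two distinct String objects with the same contents as keys. The class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class MapIdentityDemo {
    /** Inserts two equal-but-distinct keys and reports how many entries remain. */
    static int countKeys(Map<Object, String> map) {
        String k1 = new String("name");
        String k2 = new String("name"); // k1.equals(k2), but k1 != k2
        map.put(k1, "first");
        map.put(k2, "second");
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap compares keys with equals(): the second put replaces the first.
        System.out.println(countKeys(new HashMap<Object, String>()));         // prints 1
        // IdentityHashMap compares keys with ==: both entries survive.
        System.out.println(countKeys(new IdentityHashMap<Object, String>())); // prints 2
    }
}
```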

Twisted said:
> as the usual implementation of the default hash code for Object is the memory address of that object,

"converted to an integer".
<http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode()>
> (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language.)

You certainly cannot rely on a correspondence.  That is what Sun's
implementation of Object.hashCode() does, but many, many subclasses override
it.  It is a best practice (see /Effective Java/ by Josh Bloch) to override
hashCode() in any class that overrides equals().  Since most of the objects in
an application likely are of subtypes of Object, it is common that their
hashCode() will not return the "address" of the object.

Javadocs rule.

Lew

Hal Vaughan - 16 Jun 2007 17:35 GMT
>>>> So is it only in extreme cases like this where hashcodes would be
>>>> duplicated?
[quoted text clipped - 4 lines]
> method?  You should.  Also read the Javadocs on Map, HashMap and
> IdentityHashMap.

I did.  The part that concerned me was this: "It is not required that if two
objects are unequal according to the equals(java.lang.Object) method, then
calling the hashCode method on each of the two objects must produce
distinct integer results."

That's why I was asking about whether they were unique within a particular
runtime.

> As Twisted pointed out, the "Identity", i.e., the internal "address" of an
> object, is unique for the lifetime of that object.  Even without
[quoted text clipped - 19 lines]
> override
> it.  

It's in one of my own classes, so I'm not concerned about it being
overridden.

> It is a best practice (see /Effective Java/ by Josh Bloch) to
> override
> hashCode() in any class that overrides equals().  

That part I did find, but I won't be overriding either one.

> Since most of the
> objects in an application likely are of subtypes of Object, it is common
> that their hashCode() will not return the "address" of the object.

I don't need to separate all objects.  I have a set of data tables that all
have a master table, but then they have sub tables that are tracked subsets
of the master tables.  I need to make sure that if I create a tracked
table, it has a different name from all the other tracked tables.  I have
one particular class that will be generating names for those tracked tables
on its own and I want to make sure that if I create, say, 5 instances of
that class, that each separate instance will create names that are
different than the names created by the other instances.

I don't need an object's address or anything, I just want to be sure that
each instance of this one class has some kind of unique ID I can use to
specify unique names for the tracked tables.

Hal
Stefan Ram - 16 Jun 2007 17:51 GMT
>I don't need an object's address or anything, I just want to be
>sure that each instance of this one class has some kind of
>unique ID I can use to specify unique names for the tracked
>tables.

 This could also be solved by a counter singleton invoked
 upon creation of such an instance. The instance then
 uses its unique counter value as a prefix for its IDs.
Hal Vaughan - 16 Jun 2007 18:13 GMT
>>I don't need an object's address or anything, I just want to be
>>sure that each instance of this one class has some kind of
[quoted text clipped - 4 lines]
>   upon creation of such an instance. The instance then
>   uses its unique counter value as a prefix for its IDs.

It took me a bit to think through this.  Do you mean making a static int and
each instance uses it as an ID, then increments it for the next one?
That's what I got, or rather, worked out.

Thanks!

Hal
Stefan Ram - 16 Jun 2007 18:30 GMT
>It took me a bit to think through this.  Do you mean making a
>static int and each instance uses it as an ID, then increments
>it for the next one?  That's what I got, or rather, worked out.

class globalCounter { private static int value = 0;
 public static int getValue(){ return value++; }}

class Identifier
{ final private java.lang.String prefix;
 private int count;
 public Identifier()
 { this.prefix = java.lang.String.valueOf( globalCounter.getValue() );
   this.count = 0; }
 public java.lang.String get()
 { return prefix + "-" + java.lang.String.valueOf( count++ ); }}

public class Main
{ final static java.lang.String lineSeparator =
 java.lang.System.getProperty( "line.separator" );
 public static void main( final java.lang.String[] args )  
 { final Identifier identifier0 = new Identifier();
   final Identifier identifier1 = new Identifier();
   java.lang.System.out.println
   ( identifier0.get() + lineSeparator +
     identifier0.get() + lineSeparator +
     identifier0.get() + lineSeparator +
     identifier1.get() + lineSeparator +
     identifier1.get() + lineSeparator +
     identifier1.get() + lineSeparator ); }}

0-0
0-1
0-2
1-0
1-1
1-2
Hal Vaughan - 16 Jun 2007 19:20 GMT
>>It took me a bit to think through this.  Do you mean making a
>>static int and each instance uses it as an ID, then increments
[quoted text clipped - 32 lines]
> 1-1
> 1-2

Okay, I got one working.  Thanks!

Hal
Stefan Ram - 16 Jun 2007 21:19 GMT
>> class globalCounter { private static int value = 0;
>>   public static int getValue(){ return value++; }}
[quoted text clipped - 4 lines]
>>   { this.prefix = java.lang.String.valueOf( globalCounter.getValue() );
>Okay, I got one working.  Thanks!

 In the meantime, I became aware of the fact, that
 it would be more simple to put the field

private static int value = 0;

 directly into the class »Identifier« and eliminate
 the class »globalCounter«.
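
 A sketch of that simplification, with the static counter renamed
 to nextPrefix for readability. Like the original it is not
 thread-safe.

```java
public class Identifier {
    // Shared by all instances; each new Identifier takes the next value.
    private static int nextPrefix = 0;

    private final String prefix;
    private int count = 0;

    public Identifier() {
        this.prefix = String.valueOf(nextPrefix++);
    }

    /** Returns the next name for this instance, in the form prefix-count. */
    public String get() {
        return prefix + "-" + count++;
    }

    public static void main(String[] args) {
        Identifier first = new Identifier();
        Identifier second = new Identifier();
        System.out.println(first.get());
        System.out.println(first.get());
        System.out.println(second.get());
    }
}
```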
Hal Vaughan - 16 Jun 2007 21:30 GMT
>>> class globalCounter { private static int value = 0;
>>>   public static int getValue(){ return value++; }}
[quoted text clipped - 12 lines]
>   directly into the class »Identifier« and eliminate
>   the class »globalCounter«.

That's how I thought of it.  I might have seen it before and forgotten it,
since it came so easily to me.  I did this:

protected static int componentID = 0;

protected int myID = componentID++;

This was in the superclass for all the subclasses that need distinct IDs.
It covers more than my original idea, but that helps in the long run.

Hal
Andreas Leitgeb - 18 Jun 2007 09:18 GMT
> That's how I thought of it.  I might have seen it before and forgotten it,
> since it came so easily to me.  I did this:
>
> protected static int componentID = 0;
>
> protected int myID = componentID++;

Are you aware, that this is *not* thread-safe?
If you were running multiple threads (which I don't know,
so this warning may be moot), then you risk getting non-unique
IDs!
Tom Hawtin - 18 Jun 2007 09:45 GMT
>> That's how I thought of it.  I might have seen it before and forgotten it,
>> since it came so easily to me.  I did this:
[quoted text clipped - 7 lines]
> so this warning may be moot), then you risk getting non-unique
> IDs!

That has come up in different replies.

But anyway: It's extremely fragile to assume that you are running in
only a single thread, and therefore write *thread-hostile* code. (It
also doesn't help to use an access modifier other than private, or to
miss off final.) Using synchronized, or better AtomicInteger (or better
AtomicLong), adds very little cost while not requiring this small,
low-level piece of code depending upon large scale assumptions. The
thing about making unnecessary, hidden assumptions is that they often
turn out, or become, incorrect.
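
A minimal sketch of the AtomicLong variant, assuming a hypothetical base class called TrackedComponent. Both fields are private and final, and getAndIncrement() hands out each value exactly once even when instances are created on several threads.

```java
import java.util.concurrent.atomic.AtomicLong;

public class TrackedComponent {
    // One counter shared by every instance, including subclasses.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private final long id = NEXT_ID.getAndIncrement();

    public final long getId() {
        return id;
    }

    public static void main(String[] args) {
        TrackedComponent a = new TrackedComponent();
        TrackedComponent b = new TrackedComponent();
        System.out.println(a.getId() != b.getId()); // prints true
    }
}
```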

Tom Hawtin
Hal Vaughan - 18 Jun 2007 19:16 GMT
>> That's how I thought of it.  I might have seen it before and forgotten
>> it,
[quoted text clipped - 8 lines]
> so this warning may be moot), then you risk getting non-unique
> IDs!

All the classes are created in a single thread *before* control is turned
over to Swing.

Hal
Lew - 16 Jun 2007 19:06 GMT
Lew wrote:

> The part that concerned me was this: "It is not required that if two
> objects are unequal according to the equals(java.lang.Object) method, then
[quoted text clipped - 3 lines]
> That's why I was asking about whether they were unique within a particular
> runtime.

They aren't, necessarily.  It depends on the hashCode() method of the object
in question.
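
A concrete case, as a sketch: String.hashCode() is specified by its Javadoc, and under that formula the unequal strings "Aa" and "BB" hash to the same int. The class name is made up for the example.

```java
public class HashCollisionDemo {
    /** True when two strings that are not equal still share a hash code. */
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode() + " " + "BB".hashCode()); // prints "2112 2112"
        System.out.println(collides("Aa", "BB"));                    // prints true
    }
}
```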

>> You certainly cannot rely on a correspondence.  That is what Sun's
>> implementation of Object.hashCode() does, but many, many subclasses
[quoted text clipped - 3 lines]
> It's in one of my own classes, so I'm not concerned about it being
> overridden.

I don't understand.  If you control the hashCode() then you know what it does.
 Where does the question come from?

>> It is a best practice (see /Effective Java/ by Josh Bloch) to
>> override
[quoted text clipped - 14 lines]
> that class, that each separate instance will create names that are
> different than the names created by the other instances.

But names aren't hash codes.  By definition, a hash reduces the size of the
value set from the domain to the range.

> I don't need an object's address or anything, I just want to be sure that
> each instance of this one class has some kind of unique ID I can use to
> specify unique names for the tracked tables.

So create unique names.  Your issue has nothing to do with hashCode().

If you override equals, let's say to guarantee that two objects with the same
name are considered equal, i.e., to "mean" the same real-world object (in your
case, a "table"), then also override hashCode().

Why not just override those methods so that any time two objects with the same
name, which may be different objects in the JVM, are understood to refer to
the same table?  This is a more normal idiom and should do everything you
need.  Then you can use normal Maps to map the name to the object that models
the table.

class TableModel
{
  private final String name;
  public TableModel( String n )
  {
    name = n;
  }
  public final String getName() { return name; }
  // other attributes
}

then in some other code
  Map <String, TableModel> tables = new HashMap <String, TableModel> ();
  public TableModel put( TableModel table )
  {
    return tables.put( table.getName(), table );
  }

If you need a table object at a later time in your code, obtain
  tables.get( name )
given the name of the table you want.  (If you get null, create a TableModel
and  put() it into the Map.)

This idiom might save you the trouble of UUID generation.

Lew

Hal Vaughan - 16 Jun 2007 19:27 GMT
> Lew wrote:
>
[quoted text clipped - 20 lines]
> does.
>   Where does the question come from?

I need a unique ID for each instance of the class.  Yes, I can control the
hashCode() method, but I wanted to know if it would be unique.  At this
point, I've taken Stefan Ram's suggestion and used a static int and each
time an instance is created this int is used as the ID number and the
static int is incremented for the next one.

...
>> I don't need an object's address or anything, I just want to be sure that
>> each instance of this one class has some kind of unique ID I can use to
>> specify unique names for the tracked tables.
>
> So create unique names.  Your issue has nothing to do with hashCode().

Now I see there are other ways to do this.  When I looked through the
Javadocs, the hashCode() function was the only thing I found that I thought
would give a unique id for each class created.

> If you override equals, let's say to guarantee that two objects with the
> same name are considered equal, i.e., to "mean" the same real-world object
[quoted text clipped - 4 lines]
> refer to
> the same table?  

Each table could contain a different subset, so if two objects with the same
name were created, they would use the same table, but they might need
separate tables.

> This is a more normal idiom and should do everything you
> need.  Then you can use normal Maps to map the name to the object that
[quoted text clipped - 17 lines]
>      return tables.put( table.getName(), table );
>    }

If I follow this correctly, then one issue is that I have to be sure that
each time an instance of this class is used, I have to make sure it is
passed a unique number as an ID.  That means keeping track of those
numbers.  I'm using a number of different modules, some I know I'll be
adding months or years from now, so I'm doing as much as possible inside
the classes I'm doing now so later I can create them and use them without
the need to go through much in docs.  When I do use them, I will likely
have only a short time to put them in place in a new module, so I'm making
them as easy as possible to use.  Essentially, I'm frontloading the
work.  More work now isn't fun, but it means when I have to quickly put
together a new module using these classes, I'll hardly have to remember a
thing or look up much in the Javadocs I create.

Hal
Eric Sosman - 16 Jun 2007 19:28 GMT
>>>> So is it only in extreme cases like this where hashcodes would be
>>>> duplicated?
[quoted text clipped - 6 lines]
> As Twisted pointed out, the "Identity", i.e., the internal "address" of
> an object, is unique for the lifetime of that object.  [...]

    Can you find this guarantee in the Javadoc or other
authoritative place?  Does this rule out 64-bit JVM's?

Eric Sosman
esosman@acm-dot-org.invalid

Lew - 16 Jun 2007 20:17 GMT
>>>>> So is it only in extreme cases like this where hashcodes would be
>>>>> duplicated?
[quoted text clipped - 9 lines]
>     Can you find this guarantee in the Javadoc or other
> authoritative place?  Does this rule out 64-bit JVM's?

You mean the guarantee that an object's "address" is unique during its
lifetime?  How else would the JVM find a particular instance?  In other words,
how could it possibly not be?

If two objects had the same "address", then a reference using that "address"
would not reference a single object, which contradicts the very definition of
an object reference.
<http://java.sun.com/docs/books/jls/third_edition/html/execution.html#12.5>
> a reference to the newly created object is returned as the result [of] the indicated constructor

There is no way for one "address" to point to two objects simultaneously.

This question has nothing to do with bit width, AFAICS.  I'm not really sure
how there could even be a question here.

Lew

Daniel Dyer - 16 Jun 2007 20:39 GMT
>>>>>> So is it only in extreme cases like this where hashcodes would be
>>>>>> duplicated?
[quoted text clipped - 24 lines]
> This question has nothing to do with bit width, AFAICS.  I'm not really  
> sure how there could even be a question here.

I think Eric's point is that the number of possible hash codes is smaller  
than the number of objects that can be addressed in a 64-bit JVM.  
Therefore System.identityHashCode(Object) cannot, in a 64-bit VM at least,  
guarantee to return unique values for all objects on the heap.

Dan.

Daniel Dyer
http://www.uncommons.org

Lew - 16 Jun 2007 20:55 GMT
>>>>>>> So is it only in extreme cases like this where hashcodes would be
>>>>>>> duplicated?
[quoted text clipped - 30 lines]
> JVM.  Therefore System.identityHashCode(Object) cannot, in a 64-bit VM
> at least, guarantee to return unique values for all objects on the heap.

The fact that he didn't mention that method in his question, but instead
referenced my comments about "address", meant it would've taken quite the leap
for me to understand that context.

Nothing in Eric's post makes mention of System.identityHashCode(Object).  How
did you infer it?

Lew

Lew - 16 Jun 2007 21:11 GMT
Daniel Dyer wrote:
>> I think Eric's point is that the number of possible hash codes is
>> smaller than the number of objects that can be addressed in a 64-bit
>> JVM.  Therefore System.identityHashCode(Object) cannot, in a 64-bit VM
>> at least, guarantee to return unique values for all objects on the heap.

> The fact that he didn't mention that method in his question, but instead
> referenced my comments about "address", meant it would've taken quite
> the leap for me to understand that context.
>
> Nothing in Eric's post makes mention of
> System.identityHashCode(Object).  How did you infer it?

Perhaps you meant that Eric meant to refute Twisted's assertion, not
referenced in his post but repeated here:
> I suggest you use System.identityHashCode(Object) to get these
> numbers. It should be a) fixed for an object's lifetime in one session
[quoted text clipped - 4 lines]
> scope; and c) not subject to being overridden unlike calling
> hashCode() on the object. This of course works if you need a globally

b) is wrong on two counts.  There is no guarantee of uniqueness within the
JVM, an application or otherwise to Object.hashCode(), as its docs clearly
state.  Also, it is not correct that the "usual implementation of the default
hash code for Object is the memory address of that object".  a) is simply a
rehash of the Javadocs' comments.  (Pun intended.)

Why he would quote unrelated comments to refute that point, only he can say,
if indeed that is what happened.  I took his remarks at face value.

Lew

Twisted - 16 Jun 2007 23:00 GMT
[snip]

Uh-oh. Some people are apparently on the warpath again, and I've been
attacked and accused of stuff.

Disregard all disparaging comments directed towards me. None of them
are true. They are to be ignored.

Regarding identityHashCode() -- I have it on good authority that the
Sun JVM implementation, and the typical implementation, uses the RAM
address of the object's handle (which isn't moved by compacting gc).
This address is necessarily unique for objects of overlapping
lifetime. The 32 bit code derived from it is not guaranteed unique on
a 64-bit system, but the odds of a collision are still extremely
minuscule unless the system is vastly larger than any current hardware
can cope with. (The running Java app had to occupy a gig or more with
just the objects that need unique IDs before a collision is remotely
likely.)

Of course the slight risk might still be intolerable. As with the risk
that occurs when using a JVM that may not use the RAM address, it's
probably also quite small. The RAM address of the handle is a free
collisionless hash on 32-bit architectures, and still gives a very
good hash distribution on 64-bit ones, so it is difficult to imagine
why anyone would implement a JVM to do something more complicated that
probably gives a poorer distribution of hashes and more collisions,
unless it was radically different in its guts, say not even having a
handle with a pointer to the class and a pointer to the instance, and
then it's hard to see how they could make GC work...

Regardless, the OP has since revealed that they control a base class
of the classes that need the IDs, which makes it simple to solve their
problem with zero risk of collisions. The method was also mentioned in
my earlier post, though this fact seems to have gone unacknowledged:

public class Base {
   // Should stay unique even on 64-bit architectures, or with
   // long-running systems.
   public final long id;
   private static long idGenerator;
   public Base () {
       synchronized (Base.class) {
           id = idGenerator;
           idGenerator++;
       }
   }
   // ...
}

If you don't construct Base instances in more than one thread at a
time, you can dump the synchronization. Otherwise it is needed to
prevent race conditions with accessing and incrementing idGenerator,
which could result in two objects getting the same id at the same
time, and the next id in sequence being skipped.
Twisted - 16 Jun 2007 23:14 GMT
> The RAM address of the handle is a free
> collisionless hash on 32-bit architectures, and still gives a very
> good hash distribution on 64-bit ones, so it is difficult to imagine
> why anyone would implement a JVM to do something more complicated that
> probably gives a poorer distribution of hashes and more collisions.

Actually, it occurs to me on rereading that to get a better
distribution on 64-bit architectures, you:
a) use the handle's address on 32 bit architectures; and
b) use bits 35 to 4 on 64 bit architectures.

This is because on 64 bit architectures, you'd be implementing your
JVM with a 16-byte handle (two eight-byte pointers, one to the class
and one to the instance) and for speed aligning them on 16-byte
boundaries, and surely packing them densely in a particular part of
memory regardless. This means you can drop the low-order 4 bits from
the handle address as probably-all-zeros and surely-all-the-same; they
don't contribute any distinctness to hash values derived from the
address. After that, using the least significant remaining 32 bits
gives best results, as anything but a huge system has a lot of zeros
in the high order bits that also contribute no distinctiveness to the
hash values. On the other hand, zeros creeping into bit 35 or below
isn't a worry as it just means you have less than 2^32 objects -- and
if the handles are contiguous in memory, this actually means there
will not be any collisions at all.

(And remember that as far as the machine/CPU is concerned, addresses
are just integers of some size or another; distinctions between int
and pointer or reference are artifacts of the higher level language.)
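
Purely as arithmetic, "bits 35 to 4" of a 64-bit address can be sketched like this. Java code cannot see object addresses, so the long parameter here is a made-up stand-in and the class name is hypothetical. Nothing in this sketch reflects what any real JVM does.

```java
public final class AddressHashSketch {
    private AddressHashSketch() {
    }

    /** Drops the low 4 bits, then keeps the next 32 (bits 35 down to 4). */
    public static int hashFromAddress(long address) {
        return (int) (address >>> 4);
    }

    public static void main(String[] args) {
        // Neighbouring 16-byte-aligned addresses map to consecutive ints.
        System.out.println(hashFromAddress(0x10L)); // prints 1
        System.out.println(hashFromAddress(0x20L)); // prints 2
        // Addresses 2^36 apart collide, since bit 36 and above are discarded.
        System.out.println(hashFromAddress(1L << 36) == hashFromAddress(0L)); // prints true
    }
}
```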
Hal Vaughan - 17 Jun 2007 00:25 GMT
...
> public class Base {
>     public final long id; // should stay unique even on 64-bit
[quoted text clipped - 8 lines]
>     ...
> }

There may be nothing to this, but, as I've said in this thread before, and
said on this group many times, being self taught, I know there are many
things I've missed.  Is there any particular reason for you using this:

     id = idGenerator;
     idGenerator++;

Instead of this:

     id = idGenerator++;

> If you don't construct Base instances in more than one thread at a
> time, you can dump the synchronization. Otherwise it is needed to
> prevent race conditions with accessing and incrementing idGenerator,
> which could result in two objects getting the same id at the same
> time, and the next id in sequence being skipped.

While this is working with Swing, all the objects are created before the
first interactive window opens, so, no, race conditions are not an issue.

Hal
Daniel Dyer - 17 Jun 2007 00:47 GMT
> ...
>> public class Base {
[quoted text clipped - 21 lines]
>
>       id = idGenerator++;

The first example is less confusing.  The single-line variant is modifying  
two variables.  And if you don't think the second example has the  
potential for confusion, you may be surprised that it is not semantically  
equivalent to the first (it does something different).

You might want to write a little program to demonstrate the difference  
between these two assignments.

    id = idGenerator++;

    id = ++idGenerator;
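
A little program along those lines might look like the sketch below. The class and method names are invented. Each method returns the assigned value followed by the counter's value afterwards.

```java
public class IncrementDemo {
    /** id = generator++ : id receives the old value, then the counter advances. */
    static long[] postIncrement(long generator) {
        long id = generator++;
        return new long[] { id, generator };
    }

    /** id = ++generator : the counter advances first, then id receives the new value. */
    static long[] preIncrement(long generator) {
        long id = ++generator;
        return new long[] { id, generator };
    }

    public static void main(String[] args) {
        long[] post = postIncrement(5);
        long[] pre = preIncrement(5);
        System.out.println(post[0] + " " + post[1]); // prints "5 6"
        System.out.println(pre[0] + " " + pre[1]);   // prints "6 6"
    }
}
```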

Anyway, for a simpler version of the same idea implemented in Twisted's  
code, just use java.util.concurrent.atomic.AtomicLong.  It deals with the  
synchronisation and incrementing for you.

Dan.

Daniel Dyer
http://www.uncommons.org

Daniel Dyer - 17 Jun 2007 00:51 GMT
>> There may be nothing to this, but, as I've said in this thread before,  
>> and
[quoted text clipped - 12 lines]
> the potential for confusion, you may be surprised that is not  
> semantically equivalent to the first (it does something different).

Sorry that's completely wrong.

It is equivalent but, as my post aptly demonstrates ;), it's not as  
obvious.

Dan.

Daniel Dyer
http://www.uncommons.org

Daniel Dyer - 17 Jun 2007 00:37 GMT
> [snip]
>
> Uh-oh. Some people are apparently on the warpath again, and I've been
> attacked and accused of stuff.

I'd hardly call it an attack.  Lew's comments addressed his disagreement  
with what you wrote, he said nothing about you yourself.  This is a forum  
for discussion, disagreements are to be expected.

> Disregard all disparaging comments directed towards me. None of them
> are true. They are to be ignored.

Right...

> Regarding identityHashCode() -- I have it on good authority that the
> Sun JVM implementation, and the typical implementation, uses the RAM
> address of the object's handle (which isn't moved by compacting gc).

The Javadocs for Object.hashCode() say:

"As much as is reasonably practical, the hashCode method defined by class  
Object does return distinct integers for distinct objects. (This is  
typically implemented by converting the internal address of the object  
into an integer, but this implementation technique is not required by the  
Java(TM) programming language.)"

Which seems to be entirely consistent with what you are saying.  So I'm  
not sure where Lew is coming from when he says:

> it is not correct that the "usual implementation of the default hash code
> for Object is the memory address of that object"

Unless he merely means that the value returned is not necessarily the  
address itself but is trivially derived from the address.

Dan.

Daniel Dyer
http://www.uncommons.org

Lew - 17 Jun 2007 00:53 GMT
> "As much as is reasonably practical, the hashCode method defined by
> class Object does return distinct integers for distinct objects. (This
[quoted text clipped - 4 lines]
> Which seems to be entirely consistent with what you are saying.  So I'm
> not sure where Lew is coming from when he says:

Because they do not say the algorithm for "converting the internal address".
Which part of the JVM address would they use?  The handle, which comprises a
pointer to two other pointers in some Sun implementations?  The pointer
values?  The heap offset into which one of the pointers points?  The pointer
to the class area?  All these ingredients are necessary to make up a real
"address" in the JVM, but only an int appears in the hashCode() output.
Also, notice the words, "As much as is reasonably practical".  They are
telling you right in the Javadocs that it is not a guarantee.

Once again, the int of a hashCode() and the "address" of an object have
different structures, different interpretations and different semantic spaces.

>> it is not correct that the "usual implementation of the default hash
>> code for Object is the memory address of that object"

It isn't.  One implementation is to derive the int from some part of the
"address" of the object by an unspecified algorithm.  The derived int is not
the same thing as the source "address".

> Unless he merely means that the value returned is not necessarily the
> address itself but is trivially derived from the address.

Maybe trivially, maybe not, but certainly derived from, and not equivalent to.
 It can't be.  As others pointed out, the theoretical space of addresses is
larger than can be represented in 32 bits.

All of this goes to what others on this thread have also pointed out, that
identityHashCode() and the underlying Object.hashCode() cannot be relied upon
to achieve a guaranteed unique handle for an object.

Signature

Lew

Mark Thornton - 17 Jun 2007 08:56 GMT
> Regarding identityHashCode() -- I have it on good authority that the
> Sun JVM implementation, and the typical implementation, uses the RAM
> address of the object's handle (which isn't moved by compacting gc).

I don't think the Sun JVMs have used object handles for many years. The
early garbage collectors did work that way.

Mark Thornton
Twisted - 17 Jun 2007 09:11 GMT
On Jun 17, 3:56 am, Mark Thornton <mark.p.thorn...@ntl-spam-world.com>
wrote:
> > Regarding identityHashCode() -- I have it on good authority that the
> > Sun JVM implementation, and the typical implementation, uses the RAM
> > address of the object's handle (which isn't moved by compacting gc).
>
> I don't think the Sun JVM's have used object handles for many years. The
> early garbage collectors did work that way.

JVMs tend to either use handles, or they directly rewrite referring
objects' reference pointers. In the latter case, the object's own RAM
address is likely to be used to set its hash code when it's created,
but then the object may get moved and the hash code stay the same. If
that's the case, and a new object gets created in the same location,
the same code might recur on objects with overlapping lifetimes. VMs like that are
probably better off using a hash of the clock time when the object was
created to set its identityHashCode. This is unlikely to show
collisions until you make tens of millions of objects, if done well.

In any event, the whole discussion is moot, since the OP did turn out
to have control of the classes whose objects need identifying, and can
just stick that snippet of increment-static-int-and-assign-to-public-
final-field code in his base class(es). And use a long if he expects
to have billions of objects or more during one JVM session with his
app. Zero collisions, mathematically guaranteed unless the app tries
to produce 2^64 + 1 or more objects, which would likely take until the
sun went out, if not longer.
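
The counter-in-a-base-class idea described above could look roughly like the sketch below. The class name Identified and the helper idString() are made up for illustration, and AtomicLong is an assumption added here so the counter stays correct if objects are created from several threads; a plain static long would do in single-threaded code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical base class: every instance takes the next value of a shared counter. */
abstract class Identified {
    // One counter shared by all subclasses; AtomicLong keeps increments thread-safe.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    /** Unique within this JVM session; assigned once, at construction. */
    public final long id = NEXT_ID.getAndIncrement();

    /** Printable form, for example as a key in a reference table. */
    public String idString() {
        return getClass().getSimpleName() + "#" + id;
    }
}
```

Any class that extends Identified gets its own number without further code. The numbers restart from zero in every JVM session, so they are not suitable as persistent keys.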
Lew - 17 Jun 2007 15:03 GMT
> On Jun 17, 3:56 am, Mark Thornton <mark.p.thorn...@ntl-spam-world.com>
> wrote:
[quoted text clipped - 8 lines]
> address is likely to be used to set its hash code when it's created,
> but then the object may get moved and the hash code stay the same.

Proving further that the hash code is not the same as the "address".  The
"address" is permitted to change, but the hashCode() result is not.  This was
part of my evidence when I pointed out
> Nor is the "address" required to remain "constant" during the lifetime of the object, unlike the return value of hashCode().

Different things, different semantic spaces.

Signature

Lew

Tom Hawtin - 18 Jun 2007 04:48 GMT
>> Regarding identityHashCode() -- I have it on good authority that the
>> Sun JVM implementation, and the typical implementation, uses the RAM
>> address of the object's handle (which isn't moved by compacting gc).
> I don't think the Sun JVM's have used object handles for many years. The
> early garbage collectors did work that way.

Absolutely.

Handles were briefly resurrected for the earliest HotSpot
implementations. On typical, modern, serious Java implementations,
objects do get moved in memory and have no fixed handles.

Even for 32-bit JVMs, identity hash codes do clash. If you look at the
values produced, they clearly aren't sensible addresses - odd numbers,
for instance. I believe that it is true that values are typically
*derived* from the address of the object at the time the hash is first
requested.
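
A rough way to look for such a clash among objects that are all still alive is sketched below. It is not the program from the bug report linked in this post; the class and method names are made up, and whether a clash shows up within the limit, and how soon, depends on the JVM.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityHashClash {
    /**
     * Creates objects and keeps every one of them reachable until two report
     * the same identity hash code. Returns how many objects had been created
     * at that point, or -1 if no clash was seen within the limit.
     */
    static int objectsUntilClash(int limit) {
        Map<Integer, Object> live = new HashMap<>();
        for (int created = 1; created <= limit; created++) {
            Object fresh = new Object();
            // put() returns the object previously stored under this hash, if any.
            Object earlier = live.put(System.identityHashCode(fresh), fresh);
            if (earlier != null) {
                return created; // 'earlier' and 'fresh' are both still reachable here
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(objectsUntilClash(1_000_000));
    }
}
```

Because the map holds a reference to every object created so far, a clash found this way cannot be explained by one object reusing the memory of a collected one.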

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6321873

I can't believe this nonsense keeps turning up...

Tom Hawtin
Lew - 18 Jun 2007 05:31 GMT
>>> Regarding identityHashCode() -- I have it on good authority that the
>>> Sun JVM implementation, and the typical implementation, uses the RAM
[quoted text clipped - 17 lines]
>
> I can't believe this nonsense keeps turning up...

Excellent link, and an interesting little program that proves the point that
> It appears straightforward to find two live objects sharing the same identityHashCode, using only a bog standard JRE. Real implementations with unique identityHashCodes for live objects, I would expect to be confined to legacy J2ME JVMs (if they had System.identityHashCode).

It seems so clear /a priori/ that hashCode() return values and so-called
"internal addresses" are such completely different beasts in such completely
different worlds that no one should be able to confuse the two.  It's also
curious that people focus on the phrase "the internal address of the object"
and not "converting ... into an integer" in the Javadoc description:
> This is typically implemented by converting the internal address of the object into an integer,
> but this implementation technique is not required by the Java™ programming language.

Really, that part of the Javadoc description should just be deleted.  In fact,
they should get rid of
> As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects.

It is imprecise and misleading.  What they should say is something like, "A
good hash code approaches uniform distribution of values across the int range."

Signature

Lew

Eric Sosman - 16 Jun 2007 21:23 GMT
>>>>>>>> So is it only in extreme cases like this where hashcodes would be
>>>>>>>> duplicated?
[quoted text clipped - 39 lines]
> Nothing in Eric's post makes mention of
> System.identityHashCode(Object).  How did you infer it?

    The inferential error, if there was one, was mine: When
you wrote about the "Identity" of an Object, I assumed you
meant its System.identityHashCode() value.  My assumption was
(I thought) strengthened when you described the "Identity" as
"the internal `address'" of the object, which matches the
highly suggestive (but not 100% prescriptive) Javadoc.  Did
you have some other way to find "the internal `address'" of a
Java object?  System.hashCode() is the closest thing I can
think of, but there are many things I haven't thought of ...

    And yes, the point of my question was that a 64-bit JVM
can (given enough heap) create more distinct objects than
there are hashCode() or identityHashCode() values.  I was
attempting what's known as "Socratic questioning;" Socrates
was evidently better at it than I am -- and he got poisoned
for it, so perhaps I ought to quit while I still have the
option ...

Signature

Eric Sosman
esosman@acm-dot-org.invalid

Lew - 16 Jun 2007 22:24 GMT
>     The inferential error, if there was one, was mine: When
> you wrote about the "Identity" of an Object, I assumed you
> meant its System.identityHashCode() value.  My assumption was

Oh, I see.  No, I'd've said the method name in full if I meant it.

There is nothing in the System.identityHashCode() Javadocs to suggest that it
returns unique values even in a 32-bit JVM.  I was referring to the rather
complex and unspecified actual JVM "address", to which I refer always in
quotes because it is not an address as thought of in many other machine
architectures.

> (I thought) strengthened when you described the "Identity" as
> "the internal `address'" of the object, which matches the

Yes, but not the System.identityHashCode().

> highly suggestive (but not 100% prescriptive) Javadoc.  Did

I don't think it's "highly" suggestive at all.  They say they implement the
Object.hashCode() method as a conversion to int from the "internal address" of
an object.  They clearly do not specify that conversion, nor what they mean by
the "internal address".  A cursory examination of how such "internal
addresses" are implemented reveals, for example, that Sun's implementation is
a pointer to a table of a pair of pointers to locations in the class area and
the heap.  Clearly this is "converted" to int via an algorithm that, as has
been documented and restated, reduces the value set from the domain ("internal
addresses") to the range (int).  There is nothing in the Sun document from
which one can conclude that this conversion results in a unique value; /au
contraire/ the documents for hashCode(), and transitively
System.identityHashCode(), explicitly warn that the value cannot be guaranteed
to be unique.  Again, 32-bit or 64-bit makes no difference.

> you have some other way to find "the internal `address'" of a
> Java object?

No, nor was this a topic in this thread.  There was a misstatement upthread
that the "internal address" is somehow equivalent to the hashCode(), but that
is neither my nor Sun's fault.

I find it generally irrelevant to look for this "address".  It is enough to
hold a reference in a variable.

>  System.hashCode() is the closest thing I can think of, but there are many things I haven't thought of ...

I assume you mean Object.hashCode(), which System.identityHashCode() invokes.

There is nothing in the return value of either method that is structurally or
meaningfully similar to the address of the object.  These methods return an
int; a JVM "address" is certainly not an int, nor conceptually representable
as a one-to-one mapping to the int range.  Nor is the "address" required to
remain "constant" during the lifetime of the object, unlike the return value
of hashCode().  The two are in completely different semantic spaces.

>     And yes, the point of my question was that a 64-bit JVM
> can (given enough heap) create more distinct objects than
> there are hashCode() or identityHashCode() values.  I was

So does a 32-bit JVM, potentially.  Neither method guarantees a unique result,
within the JVM, within an application, or within a moment.  That is why both
HashMap and IdentityHashMap have to resolve collisions.  Even in a 32-bit
environment.

However, the object's "address", whatever that is, is perforce unique.
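
Since IdentityHashMap compares keys by reference and copes with clashing hash codes, one possible way to number objects of classes you do not control is a small registry like the sketch below. The class ObjectIds and its method idOf() are hypothetical names, and the approach assumes it is acceptable that the registry keeps every registered object reachable, which prevents it from being garbage collected.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class ObjectIds {
    // Keys are compared with ==, so equal-but-distinct objects get separate entries.
    private final Map<Object, Long> ids = new IdentityHashMap<>();
    private long next = 0;

    /** Returns the same number for the same object and a new number for any other. */
    synchronized long idOf(Object o) {
        Long known = ids.get(o);
        if (known == null) {
            known = next++;
            ids.put(o, known);
        }
        return known;
    }
}
```

The uniqueness here comes from the counter and the reference comparison, not from the hash code, which only serves to find the right bucket.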
Stefan Ram - 16 Jun 2007 17:46 GMT
>I suggest you use System.identityHashCode(Object) to get these
>numbers. It should be a) fixed for an object's lifetime in one
>session (it will change when the object is serialized and later
>deserialized); b) globally unique (within the one JVM anyway)
>as the usual implementation of the default hash code for Object
>is the memory

http://download.java.net/jdk7/docs/api/java/lang/System.html#identityHashCode(java.lang.Object)

 is not guaranteed to be »unique« by the documentation, this is
 only a property of some implementations (as you have written
 yourself).

 On a 32-bit-system, two objects with non-overlapping lifetime
 might share the same address.

 On a 64-bit-system, even two objects with overlapping lifetime
 might need to share the same 32-bit identity hash code.

 The original poster might explain, what it is that he wants to
 accomplish with the unique ID, as this might provide better
 answers.
Stefan Ram - 16 Jun 2007 18:18 GMT
>implementation of the default hash code for Object is the memory
>address of that object, which is necessarily globally unique in that

 Readers are encouraged to run the following program (and be
 prepared to wait several minutes, but not more than 15, for
 the output) and then report the outcome here.

public class Main
{ final static java.lang.String lineSeparator =
 java.lang.System.getProperty( "line.separator" );
 public static void main( final java.lang.String[] args )  
 { final java.lang.Object object = new java.lang.Object();
   final int code = object.hashCode();
   java.lang.Object object1;
   int code1;
   do
   { code1 =( object1 = new java.lang.Object() ).hashCode(); }
   while( code1 != code );
   java.lang.System.out.print
   (( object == object1 )+ lineSeparator +
     code + lineSeparator +
     code1 + lineSeparator ); }}
Hal Vaughan - 16 Jun 2007 18:27 GMT
> public class Main
> { final static java.lang.String lineSeparator =
[quoted text clipped - 11 lines]
> code + lineSeparator +
> code1 + lineSeparator ); }}

What kind of safeguards does Java have in place so this doesn't overload my
CPU or RAM?

Hal
Bent C Dalager - 18 Jun 2007 09:56 GMT
>What kind of safeguards does Java have in place so this doesn't overload my
>CPU or RAM?

Unless you override it, the Java VM will stop at 64 (or is it 128?) MB
of RAM usage and throw an OutOfMemoryError if it still needs more.
This will tend to terminate the program. CPU usage is best controlled
via your OS (set the process to low priority, up its nice value, etc.,
depending on which OS you're using).

Cheers
    Bent D
Signature

Bent Dalager - bcd@pvv.org - http://www.pvv.org/~bcd
                                   powered by emacs

Mark Thornton - 16 Jun 2007 21:38 GMT
> I suggest you use System.identityHashCode(Object) to get these
> numbers. It should be a) fixed for an object's lifetime in one session
[quoted text clipped - 3 lines]
> address of that object, which is necessarily globally unique in that
> scope; and c) not subject to being overridden unlike calling

While the object's address may be unique (amongst objects existing at a
given time), the value returned by System.identityHashCode is NOT
guaranteed to be unique. Indeed in some cases it couldn't be. The
hashCode is a 32 bit integer, but a 64 bit VM could have more than 2^32
objects, in which case some of those objects would have the same hash code.

Mark Thornton
Stefan Ram - 16 Jun 2007 21:51 GMT
>While the objects address may be unique (amongst objects existing at a
>given time), the value returned by System.identityHashCode is NOT
>guaranteed to be unique. Indeed in some cases it couldn't be. The
>hashCode is a 32 bit integer, but a 64 bit VM could have more than 2^32
>objects, in which case some of those objects would have the same hash code.

 I am somewhat disappointed that no one has yet
 reported results from the program I posted in

<identityHashCode-20070616191650@ram.dialup.fu-berlin.de>

public class Main
{ final static java.lang.String lineSeparator =
 java.lang.System.getProperty( "line.separator" );
 public static void main( final java.lang.String[] args )  
 { final java.lang.Object object = new java.lang.Object();
   final int code = object.hashCode();
   java.lang.Object object1;
   int code1;
   do
   { code1 =( object1 = new java.lang.Object() ).hashCode(); }
   while( code1 != code );
   java.lang.System.out.print
   (( object == object1 )+ lineSeparator +
     code + lineSeparator +
     code1 + lineSeparator ); }}
Hal Vaughan - 16 Jun 2007 22:53 GMT
>>While the objects address may be unique (amongst objects existing at a
>>given time), the value returned by System.identityHashCode is NOT
[quoted text clipped - 5 lines]
>   I am somewhat disappointed, that no one has yet
>   reported about results from the program I posted in

One point here: I'm self taught.  There's a lot I know I don't know.  I
don't see what limits the loop.  Just how much will this do and will it
slow down a system or anything like that?  I'm asking because I don't have
a spare system to test anything on now.

Hal
Stefan Ram - 16 Jun 2007 23:05 GMT
>One point here: I'm self taught.  There's a lot I know I don't
>know.  I don't see what limits the loop.  Just how much will
>this do and will it slow down a system or anything like that?
>I'm asking because I don't have a spare system to test anything
>on now.

 Usually, an operating system allows one to terminate a JVM at
 any time chosen.
Hal Vaughan - 17 Jun 2007 00:21 GMT
>>One point here: I'm self taught.  There's a lot I know I don't
>>know.  I don't see what limits the loop.  Just how much will
[quoted text clipped - 4 lines]
>   Usually, an operating system allows one to terminate a JVM at
>   any time chosen.

Yes, but sometimes a program grabs so many resources that it can take a long
time to type in "kill -9 ----".

I'll try it when I don't have the IDE up and I'm trying to get this current
job finished.

Hal
Stefan Ram - 16 Jun 2007 17:12 GMT
>So is it only in extreme cases like this where hashcodes would
>be duplicated?

 The hashcode only needs to fulfill the requirements of

http://download.java.net/jdk7/docs/api/java/lang/Object.html#hashCode()

 These requirements are compatible with an implementation
 that always returns the same value for each object.
 
 The hashcode of an object depends on the class.
 Therefore, if the class of an object is not known,
 one can not assert more than the above requirements.
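
To illustrate the point about the requirements, here is a sketch of a class whose hashCode() returns the same value for every instance. The class name ConstantHash and the value 42 are arbitrary choices for this example. It satisfies the documented contract, and HashMap still works with it, only more slowly.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantHash {
    final String name;

    ConstantHash(String name) { this.name = name; }

    @Override public boolean equals(Object other) {
        return other instanceof ConstantHash && ((ConstantHash) other).name.equals(name);
    }

    // Legal under the hashCode() contract, but useless as an identifier.
    @Override public int hashCode() { return 42; }

    public static void main(String[] args) {
        Map<ConstantHash, String> map = new HashMap<>();
        map.put(new ConstantHash("a"), "first");
        map.put(new ConstantHash("b"), "second");
        // Both entries survive because HashMap falls back on equals() within a bucket.
        System.out.println(map.size() + " " + map.get(new ConstantHash("b")));
    }
}
```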
Karl Uppiano - 16 Jun 2007 17:23 GMT
>I have a case where I'll need distinct and printable names to use in a
> reference table.  I'd like to make it so each object, whether it's of the
[quoted text clipped - 4 lines]
> Is there any way to get a unique string or number for each object that is
> created by a particular JVM?

UUIDs are sometimes used for applications like this, as long as you remain
cognizant of the possible dynamic range and/or performance limitations.
However, it represents a 128-bit value, and java.util.UUID.randomUUID has
been fast enough for my needs.

http://en.wikipedia.org/wiki/UUID

http://java.sun.com/javase/6/docs/api/index.html
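
A minimal sketch of the UUID approach, assuming a random (version 4) UUID per object is good enough; the class and method names are made up for the example.

```java
import java.util.UUID;

class UuidNames {
    /** Returns a fresh random UUID in its 36-character text form. */
    static String newName() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(newName());
    }
}
```

Random UUIDs are not guaranteed to differ, but with 122 random bits a repeat is extremely unlikely in practice.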
rossum - 16 Jun 2007 21:30 GMT
>I have a case where I'll need distinct and printable names to use in a
>reference table.  I'd like to make it so each object, whether it's of the
[quoted text clipped - 8 lines]
>
>Hal
One possibility that has not yet been mentioned in this discussion is
to use a cryptographic hash or Message Digest.  SHA-256 produces a 256
bit hash, so collisions are much less likely than with a 32 bit
integer hash.  The cryptographic hash function will run rather more
slowly than the integer hash function though - nothing comes for free.

Depending on the providers available in your Java implementation have
a look at:

MD5: 128 bit output

SHA-1: 160 bit output

SHA-256: 256 bit output

Both MD5 and SHA-1 have cryptographic weaknesses, but they are fine
for non-cryptographic purposes.  MD5 is less slow than either of the
SHAs.

rossum
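
A sketch of the digest idea using java.security.MessageDigest follows. It assumes each object can be turned into some text (or bytes) that describes it; the class and method names are hypothetical. Note that a digest is computed from content, so two objects with identical content get the same digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestId {
    /** Hex-encoded SHA-256 of the given text: 64 hex digits for 256 bits. */
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```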
tony - 17 Jun 2007 05:03 GMT
Using the time when the object is created as its ID can ensure its uniqueness.
Karl Uppiano - 17 Jun 2007 10:13 GMT
> use the time when the obj is created as its ID can ensure its uniqueness

That only works if the objects are created slower than the resolution of the
timebase. Otherwise, you will get the same value for objects created in
rapid succession. For local uniqueness, a concatenation of time and serial
number (a simple counter) should be sufficient.

A UUID, depending on the generation algorithm, uses time, and/or the
computer's MAC address and/or random number generation to create unique
values. In this way, even objects created on different machines may be
combined into common database tables with near zero risk of collisions.
Plus, UUIDs are eminently printable, which was a requirement of the OP.
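
The time-plus-serial-number combination mentioned above might be sketched as follows. The class name, the "-" separator and the use of AtomicLong are choices made for this example only.

```java
import java.util.concurrent.atomic.AtomicLong;

class TimeSerialId {
    // The serial number never repeats within one JVM session.
    private static final AtomicLong SERIAL = new AtomicLong();

    /** Creation time in milliseconds, then a serial number, as printable text. */
    static String next() {
        return System.currentTimeMillis() + "-" + SERIAL.getAndIncrement();
    }
}
```

Within one session the serial number alone already makes the IDs distinct; the time part mainly helps to tell apart IDs from different runs.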
Mark Thornton - 17 Jun 2007 10:20 GMT
>> use the time when the obj is created as its ID can ensure its uniqueness

> Plus, UUIDs are eminently printable, which was a requirement of the OP.

But not at all memorable. Using a printed UUID for any purpose is tedious.

Mark Thornton
Karl Uppiano - 17 Jun 2007 19:40 GMT
>>> use the time when the obj is created as its ID can ensure its uniqueness
>
>> Plus, UUIDs are eminently printable, which was a requirement of the OP.
>
> But not at all memorable. Using a printed UUID for any purpose is tedious.

Probably no more tedious than any large number that will undoubtedly result
if you have to keep track of more than a few thousand instantiations.
Roedy Green - 29 Jun 2007 16:32 GMT
On Sat, 16 Jun 2007 01:12:31 -0400, Hal Vaughan
<hal@thresholddigital.com> wrote, quoted or indirectly quoted someone
who said :

>Is there any way to get a unique string or number for each object that is
>created by a particular JVM?
The hash is almost unique.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Roedy Green - 29 Jun 2007 16:33 GMT
On Sat, 16 Jun 2007 01:12:31 -0400, Hal Vaughan
<hal@thresholddigital.com> wrote, quoted or indirectly quoted someone
who said :

>Is there any way to get a unique string or number for each object that is
>created by a particular JVM?

You can do it by sequence-numbering your objects in the constructor.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com


©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.