Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / September 2007

Tip: Looking for answers? Try searching our database.

Can I compare references (in a sense of compareTo method)?

Thread view: 
chucky - 20 Sep 2007 22:24 GMT
When testing two objects for equality (with == operator), the
references are compared. But it is not allowed to compare the objects
(references) with <, <=, >, >= operators.

Is it possible somehow to get the address of an object and then
compare it?

What I want to do is to have each instance of my class unique, so that
two objects are equal only if they are the same object. This is easy,
I can make equals method to compare references. But I would like to
use these objects in TreeSet/TreeMap and therefore implement the
compareTo method in a way that would be consistent with respect to
equals.

I can think of one workaround: the class having an additional integer
field id and creating the instances with a synchronized factory
method. But just want to know if it is possible without that extra
field.

Thanks for replies!

Tomas
Daniel Pitts - 20 Sep 2007 22:42 GMT
> When testing two objects for equality (with == operator), the
> references are compared. But it is not allowed to compare the objects
[quoted text clipped - 18 lines]
>
> Tomas

See System.identityHashCode(object)

Now, the questions remains, if they don't have inherent order, why use
them in a TreeSet or TreeMap?  Why not a standard HashSet/HashMap?
The main benefit of Tree* is that it maintains the order for you.
chucky - 21 Sep 2007 08:26 GMT
Thank you guys, seems you have answered my question indirectly --
there is no way to get the address of an object :-).

> Now, the questions remains, if they don't have inherent order, why use
> them in a TreeSet or TreeMap?  Why not a standard HashSet/HashMap?
> The main benefit of Tree* is that it maintains the order for you.

I don't care about the actual ordering, but I wanted to use Tree*
since I thought it would have smaller memory overhead than Hash*. I
don't care much about logarithmic complexity (which is btw.
guaranteed, while Hash operations don't guarantee constant
complexity), since I want to use it for small collections (<20
elements).

Why is HashSet/HashMap more standard than TreeSet/TreeMap?

T.
Lasse Reichstein Nielsen - 21 Sep 2007 11:48 GMT
> I don't care about the actual ordering, but I wanted to use Tree*
> since I thought it would have smaller memory overhead than Hash*.

Not by much. A TreeMap has a node per element with (probably) four or
five references and a boolean, whereas a HashMap has a lookup array of
references and a node per element with three references (a
single-linked list).

...

> Why is HashSet/HashMap more standard than TreeSet/TreeMap?

It doesn't require its keys to be Comparable, or have a Comparator.

/L
Signature

Lasse Reichstein Nielsen  -  lrn@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
 'Faith without judgement merely degrades the spirit divine.'

Lew - 21 Sep 2007 13:45 GMT
chucky writes:
>> Why is HashSet/HashMap more standard than TreeSet/TreeMap?

> It doesn't require its keys to be Comparable, or have a Comparator.

chucky writes:
>> I don't care about the actual ordering, but I wanted to use Tree*
>> since I thought it would have smaller memory overhead than Hash*. I
>> don't care much about logarithmic complexity (which is btw.
>> guaranteed, while Hash operations don't guarantee constant
>> complexity), since I want to use it for small collections (<20
>> elements).

"Premature optimization is the root of all evil." - Knuth

You are mistaken about "Hash [sic] operations don't guarantee constant
complexity".

<http://java.sun.com/javase/6/docs/api/java/util/HashMap.html>
> This implementation provides constant-time performance for
> the basic operations (get and put), assuming the hash function
> disperses the elements properly among the buckets.

Whereas
<http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html>
> This implementation provides guaranteed log(n) time cost
> for the containsKey, get, put and remove  operations.

Signature

Lew

Chris Smith - 21 Sep 2007 14:44 GMT
> You are mistaken about "Hash [sic] operations don't guarantee constant
> complexity".

I don't agree with this.  If one has chosen a bad hash function for the
particular distribution of objects in use, then some operations on
HashMap will degrade to linear time, which is *far* worse than
logarithmic.  The chance of this is very small, but since hash functions
in Java are constant and chosen at build time, it's always feasible for
any type of data in which there are far more possible values of that
data structure than there are hash buckets.

There are techniques for avoiding this: one can randomize hash functions
to get a true probablistic "expected constant time", but Java's has
functions don't do that.  (I suppose one could do it, but it would be
quite uncommon).  And one can generate perfect hash functions given data
ahead of time, but that's not normally done either.  Certainly someone
just saying "use HashMap" can't be interpreted as talking about either
of these techniques; so in practice, HashMap does not guarantee its
amortized constant running time.

Regarding the Knuth quote, it's important to note that from context,
Knuth was talking about "small" inefficiencies; i.e., if this is the
defining characteristic of the problem, then it may be quite worthwhile
to consider in advance if "probably constant, depending on input" or
"always logarithmic" is better.

Signature

Chris Smith

Lew - 21 Sep 2007 14:52 GMT
> ... if this is the
> defining characteristic of the problem, then it may be quite worthwhile
> to consider in advance if "probably constant, depending on input" or
> "always logarithmic" is better.

You are correct.  They also have to weigh if "additional complexity introduced
by the need to spuriously define a Comparator" is a worthwhile cost.  Perhaps
the effort is better spent writing a good implementation of hashCode() for the
key class.

I'd weigh the costs of possible O(n) response time with odds of what, ten to
the negative umpteenth?, against the cost of difficulties from spurious
algorithmic warts, with near certainty if the program is maintained.  For
nearly all problems I predict the mathematical expectation favors HashMap with
a well-designed key hashCode() override, over TreeMap with the additional
maintenance cost of imputing an order.

Signature

Lew

Patricia Shanahan - 21 Sep 2007 14:56 GMT
> Thank you guys, seems you have answered my question indirectly --
> there is no way to get the address of an object :-).

It isn't even guaranteed that an object has an address.

>> Now, the questions remains, if they don't have inherent order, why use
>> them in a TreeSet or TreeMap?  Why not a standard HashSet/HashMap?
[quoted text clipped - 8 lines]
>
> Why is HashSet/HashMap more standard than TreeSet/TreeMap?

They are both completely standard, but they are different classes with
different purposes. The Tree structures are designed for ordered access,
and you are having to stretch to try to apply them to your unordered
objects.

For very small collections, I would also consider ArrayList.

Patricia
Daniel Pitts - 21 Sep 2007 16:18 GMT
> Thank you guys, seems you have answered my question indirectly --
> there is no way to get the address of an object :-).
[quoted text clipped - 13 lines]
>
> T.

Hash doesn't have a huge memory overhead, and generally you shouldn't
worry about such things until they become a problem.

If you declare your sets/maps using the interface "Set<MyThing>" and
"Map<MyThing, OtherThing>", then you can switch between different
implementations.

HashSet and HashMap are the default choice, because they are on
average O(1) operations.  TreeSet and TreeMap have a very specific
purpose, and that purpose is to guarantee and ordering of elements.
chucky - 21 Sep 2007 19:31 GMT
> If you declare your sets/maps using the interface "Set<MyThing>" and
> "Map<MyThing, OtherThing>", then you can switch between different
> implementations.

And then I will have to implement Comparable or provide a
Comparator:-)

Anyway, I got my answer already, thank you for sharing your ideas!

Tomas
Roedy Green - 22 Sep 2007 20:48 GMT
>I don't care about the actual ordering, but I wanted to use Tree*
>since I thought it would have smaller memory overhead than Hash*

I think you need to do an experiment to verify that. I
suspect the opposite since you would need a node per element in both,
with the TreeSet having in addition the tree structure, where the
HashSet has a simple array of pointers.

It seems obvious to me that a HashSet should be FASTER than a TreeSet.
You go right to the element with a single division or mask, rather
than chasing down a multi-level index structure.
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Lew - 22 Sep 2007 21:16 GMT
>> I don't care about the actual ordering, but I wanted to use Tree*
>> since I thought it would have smaller memory overhead than Hash*
[quoted text clipped - 7 lines]
> You go right to the element with a single division or mask, rather
> than chasing down a multi-level index structure.

People are concerned with the corner case, wherein the hash distributes the
(presumably random) input so imperfectly that linear search time at the
appropriate hash bucket exceeds the log(n) performance of the tree search.
The argument seems to be that TreeFoo's guarantee of O(log(n)) search
complexity pays off since HashFoo would degrade to O(n) in that case.

Either the input is sufficiently random that the odds of this happening are
acceptable, given a good general-purpose hash, or there are expectations that
would cause a general hash to distribute values poorly, in which case one
should override hashCode() with a better choice of algorithm for the expected
input shape.

TreeFoo suffers insertion performance degradation compared to HashFoo, too,
doesn't it?  So not only would the effort of providing and maintaining a
Comparator have to exceed that of writing a good hashCode(), and the odds that
the input would have the shape to trigger the scenario exceed the
worthwhileness threshold, but the Foo would have to be frequently read, rarely
written, to make the Tree version perform better than the Hash version.

Given the fragility of semantically superfluous code introduced for putative
performance gains absent hard evidence of such gain, the simplicity of an
alternative solution that is likely to handle the issue, and the so-far high
uncertainty that the nature of the data will even remotely justify the
attention, I recommend avoiding TreeFoo for this particular matter.

Signature

Lew

Knute Johnson - 20 Sep 2007 22:43 GMT
> When testing two objects for equality (with == operator), the
> references are compared. But it is not allowed to compare the objects
[quoted text clipped - 18 lines]
>
> Tomas

Do you not have access to the docs?  It has nothing to do with == or the
address of the object.  You need to look at TreeMap and Comparator.

public class TreeMap<K,V>
extends AbstractMap<K,V>
implements NavigableMap<K,V>, Cloneable, Serializable

A Red-Black tree based NavigableMap implementation. The map is sorted
according to the natural ordering of its keys, or by a Comparator
provided at map creation time, depending on which constructor is used.

Signature

Knute Johnson
email s/nospam/knute/

Daniel Pitts - 20 Sep 2007 23:10 GMT
> > When testing two objects for equality (with == operator), the
> > references are compared. But it is not allowed to compare the objects
[quoted text clipped - 34 lines]
> Knute Johnson
> email s/nospam/knute/

Whether or not the OP reads the docs, you should try reading the post
before you reply.

He has an object that he wants its equals identity to match its ==
identity.  He also wants to use said object (for some bewildering
reason) in Tree based collections.

He's probably misguided in that approach, and I've tried to suggest
alternatives to him.
Knute Johnson - 21 Sep 2007 00:55 GMT
>>> When testing two objects for equality (with == operator), the
>>> references are compared. But it is not allowed to compare the objects
[quoted text clipped - 31 lines]
> Whether or not the OP reads the docs, you should try reading the post
> before you reply.

I did.  He wants to compare unique objects with their references.  And
for some unknown reason he wants to order them in a TreeMap.

> He has an object that he wants its equals identity to match its ==
> identity.  He also wants to use said object (for some bewildering
> reason) in Tree based collections.

That is why I suggested (and supplied the salient part) he read the
docs.  He obviously doesn't know what a TreeMap is for.

> He's probably misguided in that approach, and I've tried to suggest
> alternatives to him.

I think so too.  And your suggestion would have been excellent had he
not had a unique set of objects.  A list will do just fine.

In any case == isn't going to be the solution to whatever his problem
is.  I'll even put five bucks behind that statement.

I think his real problem is in not describing the problem he is
attempting to solve but just his method for solving it.  This is a
common problem that posters on this list have.  I know I've been there.
 It can be very difficult to provide a concise problem description that
will get you to where you need to be.

Signature

Knute Johnson
email s/nospam/knute/

Daniel Pitts - 21 Sep 2007 01:56 GMT
> >>> When testing two objects for equality (with == operator), the
> >>> references are compared. But it is not allowed to compare the objects
[quoted text clipped - 54 lines]
> attempting to solve but just his method for solving it.  This is a
> common problem that posters on this list have.  I know I've been there.
I think we all have.
>   It can be very difficult to provide a concise problem description that
> will get you to where you need to be.
It's less difficult if you are able to step back from *how* you want
to solve it, and look at *what* you're trying to solve.

> --
>
> Knute Johnson
> email s/nospam/knute/
Roedy Green - 21 Sep 2007 17:16 GMT
>Is it possible somehow to get the address of an object and then
>compare it?
see http://mindprod.com/jgloss/hashcode.html#OBJECT
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

chucky - 21 Sep 2007 19:28 GMT
On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:
> >Is it possible somehow to get the address of an object and then
> >compare it?
>
> see http://mindprod.com/jgloss/hashcode.html#OBJECT

As I stated in the title of this topic and in first paragraph of my
original post, I was intrested in comparing the addresses in sense of
ordering. So the page you linked does not answer the question neither
positively nor negatively.

Moreover, it states that "The default hashCode() method uses the 32-
bit internal JVM address of the Object as its hashCode."
But http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode()
says:

"As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct objects. (This
is typically implemented by converting the internal address of the
object into an integer, but this implementation technique is not
required by the JavaTM programming language.)"

So the default hashCode() does not necessarily involve adresses.

Regards,
Tomas
Lew - 21 Sep 2007 19:41 GMT
>>> Is it possible somehow to get the address of an object and then
>>> compare it?

> As I stated in the title of this topic and in first paragraph of my
> original post, I was intrested in comparing the addresses in sense of
> ordering. So the page you linked does not answer the question neither
> positively nor negatively.

Take note of what Patricia Shanahan wrote:
> It isn't even guaranteed that an object has an address.

Even when an object does have an "address", that "address" can change during
the lifetime of the object, for example after a garbage collection.

> So the default hashCode() does not necessarily involve adresses.

and cannot, really, for the reasons cited.

I think Sun made a huge mistake referring to "address" in the docs for
hashCode().  There is no meaningful numeric interpretation of an object's
"address" in the JVM.  Even so, the Javadocs do use the terminology
"/converting/ the internal address of the object" (emph. added), indicating
that even with this undefined concept of "address" it isn't even necessarily a
direct correspondence.

The hashCode() normally will not change unless the contents of the object
change.  That means any tenuous association between the object's "address",
whatever that is, and its hashCode() will be broken at the next GC, assuming
no intervening changes in the object's values.

And assuming the object hasn't been optimized away entirely, thus eliminating
its "address" altogether.

The point is that Java as a language deliberately prevents the programmer from
doing direct address manipulation, preferring object-oriented reference
semantics.  You cannot in general produce a consistent "address" for an object.

Signature

Lew

chucky - 22 Sep 2007 15:27 GMT
> Take note of what Patricia Shanahan wrote:
>
> > It isn't even guaranteed that an object has an address.

Oops, seems I'm missing some posts :(. I'm reading this group at
google groups and cannot see that message from Patricia :(. The
message counter at the top of
http://groups.google.com/group/comp.lang.java.programmer/browse_thread/thread/03
94b3b3e68ff139/

shows that there are 18 messages in this thread, but there are
actually only 14 shown. Does this group has an archive somewhere else?

T.
Lew - 22 Sep 2007 15:34 GMT
>> Take note of what Patricia Shanahan wrote:
>>
[quoted text clipped - 6 lines]
> shows that there are 18 messages in this thread, but there are
> actually only 14 shown. Does this group has an archive somewhere else?

Once again Google Groups causes trouble.

There are just /so/ many reports of people having difficulties with Google
Groups.

Does your ISP have a news server?  Mine (the local cable company) does,
through Giganews, and my subscription for 'net access includes something like
2 GiB downloads from news per month.  I use Mozilla Thunderbird (free) for
access to it.

From everything I've read about it, I'd neither use nor recommend Google
Groups.  In fact, if I were Google I'd be embarrassed to have my name
associated with it.

As to other archives, I believe they exist but in sort of the same way I
belief there's life on other planets.  It makes sense that something exists
but I've got no direct evidence.  Someone else, or perhaps a quick GIYF
search, could tell you.

Signature

Lew

Lew - 22 Sep 2007 15:55 GMT
>>> Take note of what Patricia Shanahan wrote:
>>>
[quoted text clipped - 12 lines]
> There are just /so/ many reports of people having difficulties with
> Google Groups.

This one might not be against GG.  The message I quoted from Patricia, and a
few that were in its subthread, are gone from my server / client as well.

Signature

Lew

Tomas Mikula - 22 Sep 2007 16:33 GMT
>>>> Take note of what Patricia Shanahan wrote:
>>>>
[quoted text clipped - 15 lines]
> This one might not be against GG.  The message I quoted from Patricia, and a
> few that were in its subthread, are gone from my server / client as well.

Anyway, I installed Pan as Joshua suggested and connected to the
university's news server and see more messages than I could see on GG.

Tomas
Joshua Cranmer - 22 Sep 2007 15:38 GMT
>> Take note of what Patricia Shanahan wrote:
>>
[quoted text clipped - 8 lines]
>
> T.

Google for "free news servers" and hook up to one of them from a
competent news reader; seeing that you are on Linux, I would recommend
Pan as a starting point.

Also check with your ISP to see if they provide a free newsserver.

Signature

Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Patricia Shanahan - 21 Sep 2007 19:49 GMT
> On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid>
> wrote:
[quoted text clipped - 6 lines]
> ordering. So the page you linked does not answer the question neither
> positively nor negatively.

If you really want to use TreeSet to do HashSet's job, it does not
matter whether the order is an actual address order or not, as long as
it is consistent.

The real problem is the risk that the hashCode() results will not be
unique. I believe it is a theoretical possibility in a 32 bit JVM.
However, in a 64 bit JVM there could be more objects of a given type
than there are int values.

Patricia
chucky - 22 Sep 2007 15:34 GMT
> > On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid>
> > wrote:
[quoted text clipped - 15 lines]
> However, in a 64 bit JVM there could be more objects of a given type
> than there are int values.

I didn't mean to compare hashCodes. I didn't realize that the address
can change in lifetime of an object, so now I'm discouraged enough
from wanting to obtain it.
Lew - 22 Sep 2007 15:37 GMT
> I didn't mean to compare hashCodes. I didn't realize that the address
> can change in lifetime of an object, so now I'm discouraged enough
> from wanting to obtain it.

It not so much that the address can change in the JVM, as that the concept of
"address" in Java is not numeric.  The more precise statement is that there
is no consistent, unique number that can be intrinsically derived from the
address.

Objects have referential identity, so in some sense there is an unchangeable
"address" to it; it's just not a number.

Signature

Lew

Roedy Green - 22 Sep 2007 20:45 GMT
>The more precise statement is that there
>is no consistent, unique number that can be intrinsically derived from the
>address.

Presumably there is a table indexed by object number pointing to the
current address of an object, at least in a simple-minded JVM. That
index number would give you a permanent object id.  The catch is, when
an object were garbage collected, a new object could recycle its slot.
So your ids would be unique in time, but not unique over the life of
the job.

If you want unique ids that will work on any platform, including
native compilation, I think you will need to get the constructor do:

this.id = TheClass.id++;

in the constructor.
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Lew - 22 Sep 2007 21:22 GMT
> If you want unique ids that will work on any platform, including
> native compilation, I think you will need to get the constructor do:
>
> this.id = TheClass.id++;
>
> in the constructor.

That static var might need synchronization, if the class is meant to be
thread-safe.  Pre-emptively, one might wrap the get-and-increment in a static
method for present or future synchronization, or conversion of id to
AtomicInteger.

Signature

Lew

Piotr Kobzda - 28 Sep 2007 22:30 GMT
>> If you want unique ids that will work on any platform, including
>> native compilation, I think you will need to get the constructor do:
[quoted text clipped - 7 lines]
> static method for present or future synchronization, or conversion of id
> to AtomicInteger.

Even synchronizing it may not prevent from non-unique ids of two (or
more) distinct objects.  When not all objects becomes garbage
immediately after use, the other short-time living objects may cause
overflow of the id.

Using time as part of the id may make it better (see java.util.UUID),
but even generation of ids like that requires extreme patience on corner
cases.

piotr


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.