Java Forum / General / September 2007
Can I compare references (in a sense of compareTo method)?
chucky - 20 Sep 2007 22:24 GMT When testing two objects for equality (with == operator), the references are compared. But it is not allowed to compare the objects (references) with <, <=, >, >= operators.
Is it possible somehow to get the address of an object and then compare it?
What I want to do is to have each instance of my class unique, so that two objects are equal only if they are the same object. This is easy, I can make equals method to compare references. But I would like to use these objects in TreeSet/TreeMap and therefore implement the compareTo method in a way that would be consistent with respect to equals.
I can think of one workaround: the class having an additional integer field id and creating the instances with a synchronized factory method. But just want to know if it is possible without that extra field.
Thanks for replies!
Tomas
Daniel Pitts - 20 Sep 2007 22:42 GMT > When testing two objects for equality (with == operator), the > references are compared. But it is not allowed to compare the objects [quoted text clipped - 18 lines] > > Tomas See System.identityHashCode(object)
Now, the questions remains, if they don't have inherent order, why use them in a TreeSet or TreeMap? Why not a standard HashSet/HashMap? The main benefit of Tree* is that it maintains the order for you.
chucky - 21 Sep 2007 08:26 GMT Thank you guys, seems you have answered my question indirectly -- there is no way to get the address of an object :-).
> Now, the questions remains, if they don't have inherent order, why use > them in a TreeSet or TreeMap? Why not a standard HashSet/HashMap? > The main benefit of Tree* is that it maintains the order for you. I don't care about the actual ordering, but I wanted to use Tree* since I thought it would have smaller memory overhead than Hash*. I don't care much about logarithmic complexity (which is btw. guaranteed, while Hash operations don't guarantee constant complexity), since I want to use it for small collections (<20 elements).
Why is HashSet/HashMap more standard than TreeSet/TreeMap?
T.
Lasse Reichstein Nielsen - 21 Sep 2007 11:48 GMT > I don't care about the actual ordering, but I wanted to use Tree* > since I thought it would have smaller memory overhead than Hash*. Not by much. A TreeMap has a node per element with (probably) four or five references and a boolean, whereas a HashMap has a lookup array of references and a node per element with three references (a single-linked list).
...
> Why is HashSet/HashMap more standard than TreeSet/TreeMap? It doesn't require its keys to be Comparable, or have a Comparator.
/L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Lew - 21 Sep 2007 13:45 GMT chucky writes:
>> Why is HashSet/HashMap more standard than TreeSet/TreeMap?
> It doesn't require its keys to be Comparable, or have a Comparator. chucky writes:
>> I don't care about the actual ordering, but I wanted to use Tree* >> since I thought it would have smaller memory overhead than Hash*. I >> don't care much about logarithmic complexity (which is btw. >> guaranteed, while Hash operations don't guarantee constant >> complexity), since I want to use it for small collections (<20 >> elements). "Premature optimization is the root of all evil." - Knuth
You are mistaken about "Hash [sic] operations don't guarantee constant complexity".
<http://java.sun.com/javase/6/docs/api/java/util/HashMap.html>
> This implementation provides constant-time performance for > the basic operations (get and put), assuming the hash function > disperses the elements properly among the buckets. Whereas <http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html>
> This implementation provides guaranteed log(n) time cost > for the containsKey, get, put and remove operations.
 Signature Lew
Chris Smith - 21 Sep 2007 14:44 GMT > You are mistaken about "Hash [sic] operations don't guarantee constant > complexity". I don't agree with this. If one has chosen a bad hash function for the particular distribution of objects in use, then some operations on HashMap will degrade to linear time, which is *far* worse than logarithmic. The chance of this is very small, but since hash functions in Java are constant and chosen at build time, it's always feasible for any type of data in which there are far more possible values of that data structure than there are hash buckets.
There are techniques for avoiding this: one can randomize hash functions to get a true probablistic "expected constant time", but Java's has functions don't do that. (I suppose one could do it, but it would be quite uncommon). And one can generate perfect hash functions given data ahead of time, but that's not normally done either. Certainly someone just saying "use HashMap" can't be interpreted as talking about either of these techniques; so in practice, HashMap does not guarantee its amortized constant running time.
Regarding the Knuth quote, it's important to note that from context, Knuth was talking about "small" inefficiencies; i.e., if this is the defining characteristic of the problem, then it may be quite worthwhile to consider in advance if "probably constant, depending on input" or "always logarithmic" is better.
 Signature Chris Smith
Lew - 21 Sep 2007 14:52 GMT > ... if this is the > defining characteristic of the problem, then it may be quite worthwhile > to consider in advance if "probably constant, depending on input" or > "always logarithmic" is better. You are correct. They also have to weigh if "additional complexity introduced by the need to spuriously define a Comparator" is a worthwhile cost. Perhaps the effort is better spent writing a good implementation of hashCode() for the key class.
I'd weigh the costs of possible O(n) response time with odds of what, ten to the negative umpteenth?, against the cost of difficulties from spurious algorithmic warts, with near certainty if the program is maintained. For nearly all problems I predict the mathematical expectation favors HashMap with a well-designed key hashCode() override, over TreeMap with the additional maintenance cost of imputing an order.
 Signature Lew
Patricia Shanahan - 21 Sep 2007 14:56 GMT > Thank you guys, seems you have answered my question indirectly -- > there is no way to get the address of an object :-). It isn't even guaranteed that an object has an address.
>> Now, the questions remains, if they don't have inherent order, why use >> them in a TreeSet or TreeMap? Why not a standard HashSet/HashMap? [quoted text clipped - 8 lines] > > Why is HashSet/HashMap more standard than TreeSet/TreeMap? They are both completely standard, but they are different classes with different purposes. The Tree structures are designed for ordered access, and you are having to stretch to try to apply them to your unordered objects.
For very small collections, I would also consider ArrayList.
Patricia
Daniel Pitts - 21 Sep 2007 16:18 GMT > Thank you guys, seems you have answered my question indirectly -- > there is no way to get the address of an object :-). [quoted text clipped - 13 lines] > > T. Hash doesn't have a huge memory overhead, and generally you shouldn't worry about such things until they become a problem.
If you declare your sets/maps using the interface "Set<MyThing>" and "Map<MyThing, OtherThing>", then you can switch between different implementations.
HashSet and HashMap are the default choice, because they are on average O(1) operations. TreeSet and TreeMap have a very specific purpose, and that purpose is to guarantee and ordering of elements.
chucky - 21 Sep 2007 19:31 GMT > If you declare your sets/maps using the interface "Set<MyThing>" and > "Map<MyThing, OtherThing>", then you can switch between different > implementations. And then I will have to implement Comparable or provide a Comparator:-)
Anyway, I got my answer already, thank you for sharing your ideas!
Tomas
Roedy Green - 22 Sep 2007 20:48 GMT >I don't care about the actual ordering, but I wanted to use Tree* >since I thought it would have smaller memory overhead than Hash* I think you need to do an experiment to verify that. I suspect the opposite since you would need a node per element in both, with the TreeSet having in addition the tree structure, where the HashSet has a simple array of pointers.
It seems obvious to me that a HashSet should be FASTER than a TreeSet. You go right to the element with a single division or mask, rather than chasing down a multi-level index structure.
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Lew - 22 Sep 2007 21:16 GMT >> I don't care about the actual ordering, but I wanted to use Tree* >> since I thought it would have smaller memory overhead than Hash* [quoted text clipped - 7 lines] > You go right to the element with a single division or mask, rather > than chasing down a multi-level index structure. People are concerned with the corner case, wherein the hash distributes the (presumably random) input so imperfectly that linear search time at the appropriate hash bucket exceeds the log(n) performance of the tree search. The argument seems to be that TreeFoo's guarantee of O(log(n)) search complexity pays off since HashFoo would degrade to O(n) in that case.
Either the input is sufficiently random that the odds of this happening are acceptable, given a good general-purpose hash, or there are expectations that would cause a general hash to distribute values poorly, in which case one should override hashCode() with a better choice of algorithm for the expected input shape.
TreeFoo suffers insertion performance degradation compared to HashFoo, too, doesn't it? So not only would the effort of providing and maintaining a Comparator have to exceed that of writing a good hashCode(), and the odds that the input would have the shape to trigger the scenario exceed the worthwhileness threshold, but the Foo would have to be frequently read, rarely written, to make the Tree version perform better than the Hash version.
Given the fragility of semantically superfluous code introduced for putative performance gains absent hard evidence of such gain, the simplicity of an alternative solution that is likely to handle the issue, and the so-far high uncertainty that the nature of the data will even remotely justify the attention, I recommend avoiding TreeFoo for this particular matter.
 Signature Lew
Knute Johnson - 20 Sep 2007 22:43 GMT > When testing two objects for equality (with == operator), the > references are compared. But it is not allowed to compare the objects [quoted text clipped - 18 lines] > > Tomas Do you not have access to the docs? It has nothing to do with == or the address of the object. You need to look at TreeMap and Comparator.
public class TreeMap<K,V> extends AbstractMap<K,V> implements NavigableMap<K,V>, Cloneable, Serializable
A Red-Black tree based NavigableMap implementation. The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.
 Signature Knute Johnson email s/nospam/knute/
Daniel Pitts - 20 Sep 2007 23:10 GMT > > When testing two objects for equality (with == operator), the > > references are compared. But it is not allowed to compare the objects [quoted text clipped - 34 lines] > Knute Johnson > email s/nospam/knute/ Whether or not the OP reads the docs, you should try reading the post before you reply.
He has an object that he wants its equals identity to match its == identity. He also wants to use said object (for some bewildering reason) in Tree based collections.
He's probably misguided in that approach, and I've tried to suggest alternatives to him.
Knute Johnson - 21 Sep 2007 00:55 GMT >>> When testing two objects for equality (with == operator), the >>> references are compared. But it is not allowed to compare the objects [quoted text clipped - 31 lines] > Whether or not the OP reads the docs, you should try reading the post > before you reply. I did. He wants to compare unique objects with their references. And for some unknown reason he wants to order them in a TreeMap.
> He has an object that he wants its equals identity to match its == > identity. He also wants to use said object (for some bewildering > reason) in Tree based collections. That is why I suggested (and supplied the salient part) he read the docs. He obviously doesn't know what a TreeMap is for.
> He's probably misguided in that approach, and I've tried to suggest > alternatives to him. I think so too. And your suggestion would have been excellent had he not had a unique set of objects. A list will do just fine.
In any case == isn't going to be the solution to whatever his problem is. I'll even put five bucks behind that statement.
I think his real problem is in not describing the problem he is attempting to solve but just his method for solving it. This is a common problem that posters on this list have. I know I've been there. It can be very difficult to provide a concise problem description that will get you to where you need to be.
 Signature Knute Johnson email s/nospam/knute/
Daniel Pitts - 21 Sep 2007 01:56 GMT > >>> When testing two objects for equality (with == operator), the > >>> references are compared. But it is not allowed to compare the objects [quoted text clipped - 54 lines] > attempting to solve but just his method for solving it. This is a > common problem that posters on this list have. I know I've been there. I think we all have.
> It can be very difficult to provide a concise problem description that > will get you to where you need to be. It's less difficult if you are able to step back from *how* you want to solve it, and look at *what* you're trying to solve.
> -- > > Knute Johnson > email s/nospam/knute/ Roedy Green - 21 Sep 2007 17:16 GMT >Is it possible somehow to get the address of an object and then >compare it? see http://mindprod.com/jgloss/hashcode.html#OBJECT
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
chucky - 21 Sep 2007 19:28 GMT On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid> wrote:
> >Is it possible somehow to get the address of an object and then > >compare it? > > see http://mindprod.com/jgloss/hashcode.html#OBJECT As I stated in the title of this topic and in first paragraph of my original post, I was intrested in comparing the addresses in sense of ordering. So the page you linked does not answer the question neither positively nor negatively.
Moreover, it states that "The default hashCode() method uses the 32- bit internal JVM address of the Object as its hashCode." But http://java.sun.com/javase/6/docs/api/java/lang/Object.html#hashCode() says:
"As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)"
So the default hashCode() does not necessarily involve adresses.
Regards, Tomas
Lew - 21 Sep 2007 19:41 GMT >>> Is it possible somehow to get the address of an object and then >>> compare it?
> As I stated in the title of this topic and in first paragraph of my > original post, I was intrested in comparing the addresses in sense of > ordering. So the page you linked does not answer the question neither > positively nor negatively. Take note of what Patricia Shanahan wrote:
> It isn't even guaranteed that an object has an address. Even when an object does have an "address", that "address" can change during the lifetime of the object, for example after a garbage collection.
> So the default hashCode() does not necessarily involve adresses. and cannot, really, for the reasons cited.
I think Sun made a huge mistake referring to "address" in the docs for hashCode(). There is no meaningful numeric interpretation of an object's "address" in the JVM. Even so, the Javadocs do use the terminology "/converting/ the internal address of the object" (emph. added), indicating that even with this undefined concept of "address" it isn't even necessarily a direct correspondence.
The hashCode() normally will not change unless the contents of the object change. That means any tenuous association between the object's "address", whatever that is, and its hashCode() will be broken at the next GC, assuming no intervening changes in the object's values.
And assuming the object hasn't been optimized away entirely, thus eliminating its "address" altogether.
The point is that Java as a language deliberately prevents the programmer from doing direct address manipulation, preferring object-oriented reference semantics. You cannot in general produce a consistent "address" for an object.
 Signature Lew
chucky - 22 Sep 2007 15:27 GMT > Take note of what Patricia Shanahan wrote: > > > It isn't even guaranteed that an object has an address. Oops, seems I'm missing some posts :(. I'm reading this group at google groups and cannot see that message from Patricia :(. The message counter at the top of http://groups.google.com/group/comp.lang.java.programmer/browse_thread/thread/03 94b3b3e68ff139/ shows that there are 18 messages in this thread, but there are actually only 14 shown. Does this group has an archive somewhere else?
T.
Lew - 22 Sep 2007 15:34 GMT >> Take note of what Patricia Shanahan wrote: >> [quoted text clipped - 6 lines] > shows that there are 18 messages in this thread, but there are > actually only 14 shown. Does this group has an archive somewhere else? Once again Google Groups causes trouble.
There are just /so/ many reports of people having difficulties with Google Groups.
Does your ISP have a news server? Mine (the local cable company) does, through Giganews, and my subscription for 'net access includes something like 2 GiB downloads from news per month. I use Mozilla Thunderbird (free) for access to it.
From everything I've read about it, I'd neither use nor recommend Google Groups. In fact, if I were Google I'd be embarrassed to have my name associated with it.
As to other archives, I believe they exist but in sort of the same way I belief there's life on other planets. It makes sense that something exists but I've got no direct evidence. Someone else, or perhaps a quick GIYF search, could tell you.
 Signature Lew
Lew - 22 Sep 2007 15:55 GMT >>> Take note of what Patricia Shanahan wrote: >>> [quoted text clipped - 12 lines] > There are just /so/ many reports of people having difficulties with > Google Groups. This one might not be against GG. The message I quoted from Patricia, and a few that were in its subthread, are gone from my server / client as well.
 Signature Lew
Tomas Mikula - 22 Sep 2007 16:33 GMT >>>> Take note of what Patricia Shanahan wrote: >>>> [quoted text clipped - 15 lines] > This one might not be against GG. The message I quoted from Patricia, and a > few that were in its subthread, are gone from my server / client as well. Anyway, I installed Pan as Joshua suggested and connected to the university's news server and see more messages than I could see on GG.
Tomas
Joshua Cranmer - 22 Sep 2007 15:38 GMT >> Take note of what Patricia Shanahan wrote: >> [quoted text clipped - 8 lines] > > T. Google for "free news servers" and hook up to one of them from a competent news reader; seeing that you are on Linux, I would recommend Pan as a starting point.
Also check with your ISP to see if they provide a free newsserver.
 Signature Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
Patricia Shanahan - 21 Sep 2007 19:49 GMT > On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid> > wrote: [quoted text clipped - 6 lines] > ordering. So the page you linked does not answer the question neither > positively nor negatively. If you really want to use TreeSet to do HashSet's job, it does not matter whether the order is an actual address order or not, as long as it is consistent.
The real problem is the risk that the hashCode() results will not be unique. I believe it is a theoretical possibility in a 32 bit JVM. However, in a 64 bit JVM there could be more objects of a given type than there are int values.
Patricia
chucky - 22 Sep 2007 15:34 GMT > > On Sep 21, 6:16 pm, Roedy Green <see_webs...@mindprod.com.invalid> > > wrote: [quoted text clipped - 15 lines] > However, in a 64 bit JVM there could be more objects of a given type > than there are int values. I didn't mean to compare hashCodes. I didn't realize that the address can change in lifetime of an object, so now I'm discouraged enough from wanting to obtain it.
Lew - 22 Sep 2007 15:37 GMT > I didn't mean to compare hashCodes. I didn't realize that the address > can change in lifetime of an object, so now I'm discouraged enough > from wanting to obtain it. It not so much that the address can change in the JVM, as that the concept of "address" in Java is not numeric. The more precise statement is that there is no consistent, unique number that can be intrinsically derived from the address.
Objects have referential identity, so in some sense there is an unchangeable "address" to it; it's just not a number.
 Signature Lew
Roedy Green - 22 Sep 2007 20:45 GMT >The more precise statement is that there >is no consistent, unique number that can be intrinsically derived from the >address. Presumably there is a table indexed by object number pointing to the current address of an object, at least in a simple-minded JVM. That index number would give you a permanent object id. The catch is, when an object were garbage collected, a new object could recycle its slot. So your ids would be unique in time, but not unique over the life of the job.
If you want unique ids that will work on any platform, including native compilation, I think you will need to get the constructor do:
this.id = TheClass.id++;
in the constructor.
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
Lew - 22 Sep 2007 21:22 GMT > If you want unique ids that will work on any platform, including > native compilation, I think you will need to get the constructor do: > > this.id = TheClass.id++; > > in the constructor. That static var might need synchronization, if the class is meant to be thread-safe. Pre-emptively, one might wrap the get-and-increment in a static method for present or future synchronization, or conversion of id to AtomicInteger.
 Signature Lew
Piotr Kobzda - 28 Sep 2007 22:30 GMT >> If you want unique ids that will work on any platform, including >> native compilation, I think you will need to get the constructor do: [quoted text clipped - 7 lines] > static method for present or future synchronization, or conversion of id > to AtomicInteger. Even synchronizing it may not prevent from non-unique ids of two (or more) distinct objects. When not all objects becomes garbage immediately after use, the other short-time living objects may cause overflow of the id.
Using time as part of the id may make it better (see java.util.UUID), but even generation of ids like that requires extreme patience on corner cases.
piotr
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|