Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / October 2003

Tip: Looking for answers? Try searching our database.

Set / List

Thread view: 
Andrew Smith - 18 Oct 2003 02:32 GMT
The most common implementation of Set is HashSet. Why did they create this
set backed by a Map, when they could have extended ArrayList and just
overridden the add(Object obj) method ?
Dave Benjamin - 18 Oct 2003 02:45 GMT
> The most common implementation of Set is HashSet. Why did they create this
> set backed by a Map, when they could have extended ArrayList and just
> overridden the add(Object obj) method ?

Because it makes for an interesting question to ask on homework problems?

Signature

.:[ dave benjamin (ramenboy) -:- www.ramenfest.com -:- www.3dex.com ]:.

: d r i n k i n g   l i f e   o u t   o f   t h e   c o n t a i n e r :
Andrew Smith - 18 Oct 2003 03:10 GMT
> > The most common implementation of Set is HashSet. Why did they create this
> > set backed by a Map, when they could have extended ArrayList and just
> > overridden the add(Object obj) method ?
>
> Because it makes for an interesting question to ask on homework problems?

It might do. However I'm curious because the HashSet source code uses a Map
and puts in dummy Object instances as the values. The only thing I can think
of is that it is quicker to check if a Hash contains a specified object than
an array backed List.
Chris Smith - 18 Oct 2003 03:15 GMT
> It might do. However I'm curious because the HashSet source code uses a Map
> and puts in dummy Object instances as the values. The only thing I can think
> of is that it is quicker to check if a Hash contains a specified object than
> an array backed List.

That's exactly it.  The Collections API is very concerned with the order
of complexity assigned to various operations.  Using hashing as the
store for a Set makes it much faster to check if a specific value is
already there (amortized constant time, in fact), which is a common
operation for a Set.

What advantage do you see for using a List?

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Robert Olofsson - 18 Oct 2003 10:08 GMT
: That's exactly it.  The Collections API is very concerned with the order
: of complexity assigned to various operations.  Using hashing as the
: store for a Set makes it much faster to check...

: What advantage do you see for using a List?

Less memory use. A hash has an array where it stores Entries,
Each entry has at least two references. A list only has the array
with references to the values.

If memory is the problem it is easy to write a ListSet that uses a
list as a backend. I would think that speed is preferd for nearly all
applications though.

/robo
Chris Smith - 18 Oct 2003 14:26 GMT
> : That's exactly it.  The Collections API is very concerned with the order
> : of complexity assigned to various operations.  Using hashing as the
[quoted text clipped - 9 lines]
> list as a backend. I would think that speed is preferd for nearly all
> applications though.

I suppose, but it's important to remember that we're talking about
substantially different natures of advantage.  The time advantage is a
difference of complexity.  The space advantage is a difference of a
constant factor.  Hence, the space advantage *might* occasionally matter
for programs that are right on the border of being fast enough.  The
time advantage makes a difference in the very nature of problems for
which the structure is appropriate.

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Mark Thornton - 18 Oct 2003 15:12 GMT
>>: That's exactly it.  The Collections API is very concerned with the order
>>: of complexity assigned to various operations.  Using hashing as the
[quoted text clipped - 17 lines]
> time advantage makes a difference in the very nature of problems for
> which the structure is appropriate.

A List is likely to be faster if the expected number of entries is very
small (e.g. ~ 1). I have some code where the maximum number of entries
is 4 (and there only 16 different sets possible).

Mark Thornton
Chris Smith - 18 Oct 2003 15:34 GMT
> A List is likely to be faster if the expected number of entries is very
> small (e.g. ~ 1). I have some code where the maximum number of entries
> is 4 (and there only 16 different sets possible).

Fine.  Again, though, only by a constant factor.  That's a whole
different ball of wax than a difference in asymptotic complexity.

(Incidentally, it seems offhand like a BitSet might be a better
abstraction for your case, anyway.)

Signature

www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Mark Thornton - 18 Oct 2003 16:02 GMT
>>A List is likely to be faster if the expected number of entries is very
>>small (e.g. ~ 1). I have some code where the maximum number of entries
[quoted text clipped - 5 lines]
> (Incidentally, it seems offhand like a BitSet might be a better
> abstraction for your case, anyway.)

You can't put objects into a BitSet even if the universe contains only 4
objects. Also the number might grow in future so I want to keep the
implementation as an internal detail.

Mark Thornton
Randall R Schulz - 20 Oct 2003 05:47 GMT
Robert,

You're right about the sub-optimality of the Sun HashSet
implementation. In part it is dictated by some of the Map API.
(In particular, the entrySet() method of interface Map.)

For an implementation that is cheaper for most of the Map operations,
check out GNU Trove / Trove4J: <http://trove4j.sourceforge.net/>

Randall Schulz

> : That's exactly it.  The Collections API is very concerned with the order
> : of complexity assigned to various operations.  Using hashing as the
[quoted text clipped - 11 lines]
>
> /robo
Thomas G. Marshall - 20 Oct 2003 04:36 GMT
Chris Smith <cdsmith@twu.net> coughed up the following:

>> It might do. However I'm curious because the HashSet source code
>> uses a Map and puts in dummy Object instances as the values. The
[quoted text clipped - 7 lines]
> already there (amortized constant time, in fact), which is a common
> operation for a Set.

Yes.  To the point, a set contains only one of anything.  In order to
enforce that with a List the class would have to iterate through itself on
every add(), to ensure no duplicates.

...[thwack]...


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.