Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / December 2006

Tip: Looking for answers? Try searching our database.

Merging Linked Lists

Thread view: 
Damo - 19 Dec 2006 19:38 GMT
Hi,

I (will) have anythng up to 6 Linked Lists of strings. I want to merge
them and remove duplicate entries at the same time. So that I end up
with one Linked List with every node containing a distinct string. I
dont really want(need) to sort the list. Does anyone know how I would
go about doing this efficiently.
Any advice would be much appreciated.
Daniel Dyer - 19 Dec 2006 19:59 GMT
> Hi,
>
[quoted text clipped - 4 lines]
> go about doing this efficiently.
> Any advice would be much appreciated.

Is LinkedList the best data structure for your needs?  Perhaps a Set would  
be better if you need no duplicates and don't care about sorting?

Signature

Daniel Dyer
http://www.uncommons.org

ballpointpenthief - 19 Dec 2006 20:54 GMT
> Hi,
>
[quoted text clipped - 4 lines]
> go about doing this efficiently.
> Any advice would be much appreciated.

Read Knuth Vol 3 (I think).
I expect sorting before merging, then merging if no duplicates will be
more efficient.
Damo - 19 Dec 2006 22:22 GMT
> > Hi,
> >
[quoted text clipped - 8 lines]
> I expect sorting before merging, then merging if no duplicates will be
> more efficient.

But its possible that two of the Linked lists will have the same item.
so even if theyre sorted i only want to merge distinct items from the
second to the first. The only thing I'm sure of is that each list will
not have duplicates within them.
Daniel Pitts - 19 Dec 2006 23:11 GMT
> > > Hi,
> > >
[quoted text clipped - 13 lines]
> second to the first. The only thing I'm sure of is that each list will
> not have duplicates within them.

Is there any reason not to use a Set instead?
Damo - 19 Dec 2006 23:23 GMT
> Is there any reason not to use a Set instead?

Ye, I was just reading about Sets , there and the more I read the more
inclined I am to use them. How are they as regards efficiency as
opposed to linked lists? Efficiency is my main prioity for this project.
Daniel Pitts - 19 Dec 2006 23:49 GMT
> > Is there any reason not to use a Set instead?
>
> Ye, I was just reading about Sets , there and the more I read the more
> inclined I am to use them. How are they as regards efficiency as
> opposed to linked lists? Efficiency is my main prioity for this project.

Well, it certainly depends on what your using them for. However,
efficiency should NEVER be the main priority of any project.  Creating
something that WORKS should be the main priority.
<soap-box>
You need to make something that works and is easy to read.  After that,
if it runs to slowly, you can run it through a profiler and figure out
what parts of it are taking the most amount of time. You'd often be
surprised.

If you start with something that works and is easy to maintain, then
tweaking the speed will be a lot easier.  If you worry about it too
soon (its called premature optimization), then you might end up with a
very slow program anyway, but it would be harder to optimize the RIGHT
parts.
</soap-box>
In any case:
HashSets are often more effecient when adding and removing arbitrary
elements if you need to ensure uniqueness.  They are of comparable
speed when iterating through the contained values.
ArrayLists are generally faster than LinkedLists if you only add/remove
from the end of the list, but look up arbitrary elements in the middle
of the list.

If you need to maintain order, then look into LinkedHashSet.
While it has a little more overhead than HashSet, it can be faster when
iterating over the entire set.

So. My professional suggestion is to make all of your types "Set" (or
"Collection" is better, depending on what you need to do), and
instantiate with "new HashSet()".  If you find, once you finish the
working project, that the HashSet is too slow for one reason or
another, it will be easy to modify your code to use a different type of
Collection, better suited to your domain.
Lee Weiner - 20 Dec 2006 00:00 GMT
>> Is there any reason not to use a Set instead?
>
>Ye, I was just reading about Sets , there and the more I read the more
>inclined I am to use them. How are they as regards efficiency as
>opposed to linked lists? Efficiency is my main prioity for this project.

Efficiency measured how?  Do Sets use more resources than LinkedLists? Yes.  
The whole reason for a Set to exist is to eliminate duplicates.  That's what
they do, that's all they do. AFAIC, a Set is the most efficient tool for your
requirement.

Lee Weiner
lee AT leeweiner DOT org
Daniel Pitts - 20 Dec 2006 00:05 GMT
> >> Is there any reason not to use a Set instead?
> >
[quoted text clipped - 9 lines]
> Lee Weiner
> lee AT leeweiner DOT org
Actually, a set is likely more effecient in some ways.  Also, they are
for much more than just eliminating duplicates.

For example, they are also for effecient "contains" checks.
Damo - 20 Dec 2006 01:14 GMT
Thanks for the words of advice. I think i'll put some more thought into
which one I use as I need some of the advantages of several of them.
With Linked lists it takes O(1) to insert into a list at an arbirtrary
position, right?
My plan was to iterate through the lists , if the first list contains
the current item in the second list,skip it or else insert it into the
corrsponding position in the first list.
this should I think return one List with no duplicates
Patricia Shanahan - 20 Dec 2006 13:35 GMT
> Thanks for the words of advice. I think i'll put some more thought into
> which one I use as I need some of the advantages of several of them.
> With Linked lists it takes O(1) to insert into a list at an arbirtrary
> position, right?

If you do anything with a specified index, LinkedList scans from the
nearer end of the list to find the place. Finding the next element from
an iterator should be faster.

> My plan was to iterate through the lists , if the first list contains
> the current item in the second list,skip it or else insert it into the
> corrsponding position in the first list.
> this should I think return one List with no duplicates

That sounds like an O(n*n) method for merging two lists, each containing
n elements, even if you use iterators.

Sort-and-merge is O(n log n). You only have to look at the head elements
of each list, move the smaller to the result, dropping any duplicates
from either list.

Dumping all the data into a HashSet is O(n). It is also VERY simple to
code, using addAll.

Patricia
Oliver Wong - 20 Dec 2006 20:44 GMT
> Thanks for the words of advice. I think i'll put some more thought into
> which one I use as I need some of the advantages of several of them.
> With Linked lists it takes O(1) to insert into a list at an arbirtrary
> position, right?

   Are we talking about linked lists, or are we talking about Sun's class,
java.util.LinkedList? Inserting into java.util.LinkedList at an arbitrary
position is O(n), the implementation has to "seek" to that arbitrary
position. A different implementation of linked list might have different
asymptotic runtime performance.

> My plan was to iterate through the lists , if the first list contains
> the current item in the second list,skip it or else insert it into the
> corrsponding position in the first list.

   Given that you said the list are unsorted, how would you determine what
the "corresponding" position is?

> this should I think return one List with no duplicates

   Yes, but using a HashSet will probably be faster.

   - Oliver
Jhair Tocancipa Triana - 20 Dec 2006 15:13 GMT
> Hi,
> I (will) have anythng up to 6 Linked Lists of strings. I want to merge
> them and remove duplicate entries at the same time.

If you don't want duplicates, what about using Sets instead of a
Lists?

Signature

-- Jhair

Lew - 20 Dec 2006 15:58 GMT
Damo  writes:
>> I (will) have anythng up to 6 Linked Lists of strings. I want to merge
>> them and remove duplicate entries at the same time.

Daniel Dyer wrote:
> Is LinkedList the best data structure for your needs?  Perhaps a Set
> would be better if you need no duplicates and don't care about sorting?

Daniel Pitts wrote:
> Is there any reason not to use a Set instead?

Daniel Pitts wrote:
> So. My professional suggestion is to make all of your types "Set" (or
> "Collection" is better, depending on what you need to do), and
> instantiate with "new HashSet()".  

Lee Weiner wrote:
> The whole reason for a Set to exist is to eliminate duplicates.  That's what
> they do, that's all they do. AFAIC, a Set is the most efficient tool for
> your requirement.

Daniel Pitts wrote:
> Actually, a set is likely more efficient in some ways.  Also, they are
> for much more than just eliminating duplicates.
>
> For example, they are also for efficient "contains" checks.

Patricia Shanahan wrote:
>> My plan was to iterate through the lists , if the first list contains
>> the current item in the second list,skip it or else insert it into the
[quoted text clipped - 10 lines]
> Dumping all the data into a HashSet is O(n). It is also VERY simple to
> code, using addAll.

> If you don't want duplicates, what about using Sets instead of a
> Lists?

Get the hint??

- Lew
Remon van Vliet - 21 Dec 2006 13:48 GMT
> Get the hint??
>
> - Lew

Sarcasm is a wonderful thing ;)
Lew - 21 Dec 2006 15:09 GMT
>> Get the hint??
>>
>> - Lew
>
> Sarcasm is a wonderful thing ;)

Is it sarcasm? I thought it was emphasis that many folks kept repeating the
same advice over and over again, with varying degrees of supporting and
compelling evidence, and that the advice was not being absorbed.

If one person calls you a horse, they're a kook.
If two people call you a horse, it's a joke or a conspiracy.
If three people call you a horse, it's time to buy a saddle.

People kept responding to the OP with good advice. More people kept responding
with the same good advice. And still more people kept responding with still
the same good advice. People put themselves out in good faith trying to help
the OP; their efforts seemed to be in vain. There was no sarcasm in pointing
that out.

- Lew
John Ersatznom - 21 Dec 2006 15:42 GMT
> If one person calls you a horse, they're a kook.
> If two people call you a horse, it's a joke or a conspiracy.
> If three people call you a horse, it's time to buy a saddle.

No, if three people call you a horse, it's time to feed your killfile,
if need be first saying sayonara to google groups or anything of the
kind. Who has time for anyone that can't even correctly identify you to
species? :)
Daniel Pitts - 21 Dec 2006 16:31 GMT
> >> Get the hint??
> >>
[quoted text clipped - 17 lines]
>
> - Lew

I'm not so sure it was in vain...
Damo hasn't argued with the point ([s]he's no Twisted after all). There
just isn't an evidence that Damo has tried with the Set and succeeded
or otherwise.  Perhaps a follow up with be nice Damo?
Twisted - 21 Dec 2006 16:59 GMT
> Damo hasn't argued with the point [implied insult directed towards me deleted]

Oh, no, not this sh.t again.
Daniel Pitts - 21 Dec 2006 18:31 GMT
> > Damo hasn't argued with the point [implied insult directed towards me deleted]
>
> Oh, no, not this sh.t again.
You're right, that was uncalled for by me. I apologize.
bugbear - 21 Dec 2006 17:02 GMT
> Hi,
>
[quoted text clipped - 4 lines]
> go about doing this efficiently.
> Any advice would be much appreciated.

1) bear in mind what other have said about not
using lists at all.

2) If you don't mind some temporary storage,

LinkedList merge(List multipleLists) {
    LinkedList newList = new LinkedList();
    Set newListSet = new HashSet();

    for(Iterator li = multipleLists.iterator(); li.hasNext();) {
        List l = (List)li.next();
        for(Iterator i = l.iterator(); i.hasNext();) {
            Object o = i.next();
            if(!newListSet.contains(o)) {
                newListSet.add(o);
                newList.add(o);
            }
        }
    }
    return newList;
}

may serve (untested code)

I'm assuming your multiple lists are held in a list...
I think that's O(N).

It also kind of retains the order of the input lists

   BugBear
djthomp - 21 Dec 2006 19:54 GMT
> > Hi,
>
[quoted text clipped - 32 lines]
>
>     BugBear

It can be simpler if you take advantage of set's addAll method:

LinkedList merge(List multipleLists) {
   Set newListSet = new HashSet();

   for(Iterator li = multipleLists.iterator(); li.hasNext();)
       newListSet.addAll((List)li.next());

   return new LinkedList(newListSet);
}
Daniel Pitts - 21 Dec 2006 20:33 GMT
> > > Hi,
> >
[quoted text clipped - 43 lines]
>     return new LinkedList(newListSet);
> }

Or, using a more type safe, Java 1.5 approach:

public static <E> List<E> merge(
  Iterable<? extends Collection<? extends E>> collections) {
   final Set<E> set = new LinkedHashSet<E>();
   for (Collection<? extends E> collection: collections) {
     set.addAll(collection);
   }
   return new LinkedList<E>(set);
}
Damo - 21 Dec 2006 21:35 GMT
hello all,
I was'nt ignoring you're advice. I was just exploring all the options,
seeing as I need some of the advantages of Linked Lists and Sets. In
particular the no duplicate feature of sets and i needed to keep the
order items were added(not necessarlly sorted)(Linked Lists), and
inserting into a particulaar point in the colection(Linked Lists).
I have been trying to work with sets but they dont seem to suit my
needs. Its too convoluted to retain the ordering of the collection
Anyway you're words of advice are appreciated.
djthomp - 21 Dec 2006 21:50 GMT
> hello all,
> I was'nt ignoring you're advice. I was just exploring all the options,
[quoted text clipped - 5 lines]
> needs. Its too convoluted to retain the ordering of the collection
> Anyway you're words of advice are appreciated.

Before choosing not to use a Set because of order retention problems,
you might take at LinkedHashSet.  It keeps track of item order by using
saving the insertion order with an internal linked list.  Iterating
over the LinkedHashSet gives you its contents in their original order.

http://java.sun.com/j2se/1.5.0/docs/api/java/util/LinkedHashSet.html
djthomp - 21 Dec 2006 21:52 GMT
> > hello all,
> > I was'nt ignoring you're advice. I was just exploring all the options,
[quoted text clipped - 10 lines]
>
> http://java.sun.com/j2se/1.5.0/docs/api/java/util/LinkedHashSet.html

...you might take -a look- at LinkedHashSet.  Sigh.
Patricia Shanahan - 21 Dec 2006 23:19 GMT
>>> hello all,
>>> I was'nt ignoring you're advice. I was just exploring all the options,
[quoted text clipped - 12 lines]
>
> ...you might take -a look- at LinkedHashSet.  Sigh.

More generally, there are java.util.Set implementations that provide
each of the common forms of ordering, so I tend to think Set as soon as
I know I want to avoid duplicates, even if it needs to be ordered.

Don't care about order: HashSet

Insertion order: LinkedHashSet

Natural order: TreeSet

Comparator-specified order: TreeSet

Patricia
John Ersatznom - 22 Dec 2006 10:31 GMT
> Don't care about order: HashSet
>
[quoted text clipped - 5 lines]
>
> Patricia

Can any of these support inserting a new element into the middle of the
ordering? Well, the first can't, and the third and fourth can if you do
the "BASIC line numbering trick" and give every item a position number,
where you leave gaps so you can fill in missing positions later. That's
a bit of a hack (but it can be done, e.g. using doubles and suggesting
the client code insert baz between foo and bar by setting
(foo.getPosition() + bar.getPosition())/2 as the position for baz.

Of course, once you are comparing stuff using a position number field
you set to suit, you're looking at a comparator that isn't consistent
with equals, which is going to bomb your TreeSet anyway. Now you need
the objects to use their position number to decide equality (and hash
code) too. You probably actually want to wrap them with a

public class PositionalHolder<T>
        implements Comparable<PositionalHolder<T>> {
    public double position;
    public T object;
    public PositionalHolder (T object, double position) {
        this.object = object;
        this.position = position;
    }
    @SuppressWarnings("unchecked")
    public void equals (Object x) {
        return (x == this)
            ||(x instanceof PositionalHolder
            && ((PositionalHolder)x).position
            == position);
    }
    public int hashCode () {
        return Double.valueOf(position).hashCode();
    }
    public int compareTo (PositionalHolder<T> other) {
        return (position < other.position)
            ?-1:((position > other.position)?1:0);
    }
    public void putBetween (PositionalHolder<T> x,
            PositionalHolder<T> y) {
        // Convenience method.
        position = (x.position + y.position)/2;
    }
}

Of course, Bad Things will likely happen if you a) use infs or nans as
positions, b) use the same position for multiple objects in the same
ordered collection, or c) change something's position while it's in one
rather than remove it, change its position, and then reinsert it. You
might want to prevent the last by making the PositionalHolder immutable
with a (x, object, y) "between" constructor, a just-an-object
constructor (position gets zero), a "before" constructor (object, y)
that uses y.position - 1, and an "after" constructor (x, object) that
uses x.position + 1. (This overload is clearly ambiguous if you try to
put a PositionalHolder inside another PositionalHolder but why the hell
would you?)
John Ersatznom - 22 Dec 2006 12:48 GMT
> public class PositionalHolder<T>
>         implements Comparable<PositionalHolder<T>> {
[quoted text clipped - 24 lines]
>     }
> }

Naturally, it was right after posting this that I came up with the
ArbitraryHolder, which holds one object it "holds" and a second whose
equals(), hashCode(), and compareTo() get used. And can be constructed
with an explicit comparator to use. On either the "held" object or the
object used to define the "equals" and "hashCode" behavior...
Oliver Wong - 21 Dec 2006 21:51 GMT
> i needed to keep the
> order items were added(not necessarlly sorted)(Linked Lists)

   Maybe you can elaborate on this requirement to help us provide you with
better advice.

   - Oliver
bugbear - 22 Dec 2006 10:23 GMT
> Or, using a more type safe, Java 1.5 approach:
>
[quoted text clipped - 6 lines]
>     return new LinkedList<E>(set);
> }

Yeah - sorry, we're still using Java 1.4 here, for reasons
of install base compatibility.

  BugBear
Daniel Pitts - 23 Dec 2006 19:30 GMT
> > Or, using a more type safe, Java 1.5 approach:
> >
[quoted text clipped - 11 lines]
>
>    BugBear

I feel really sorry for you.  Java 1.5 is a vast improvement over 1.4.
In my opinion, the distance between 1.1 and 1.4 is the same as 1.4 to
1.5.

I find it a lot easier to upgrade an install base, than to maintain 1.4
code.

Daniel.
John Ersatznom - 24 Dec 2006 14:10 GMT
> I find it a lot easier to upgrade an install base, than to maintain 1.4
> code.

Agreed. All those casts! They make pre-1.5 Java the first runner up for
the Too Damn Many Parentheses Award. (We all know who the actual winner
is, I trust.)
Daniel Pitts - 24 Dec 2006 19:37 GMT
> > I find it a lot easier to upgrade an install base, than to maintain 1.4
> > code.
>
> Agreed. All those casts! They make pre-1.5 Java the first runner up for
> the Too Damn Many Parentheses Award. (We all know who the actual winner
> is, I trust.)

Whitespace's cousin Parentheses?


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.