Java Forum / General / February 2007

Advice/Help with Multithreading

DyslexicAnaboko - 17 Jan 2007 01:03 GMT
I wrote a method that will take a URL, and return its page in String
form.

How long a download takes depends on which webpage is being visited.
There is a difference between getting the contents of Google vs. Yahoo,
since the page sizes obviously differ.

Since I would have many pages to download, downloading them 1 at a time
takes forever. I just want to speed things up. I figured that
multithreading would be my answer since I could create several threads
to download pages simultaneously. I am inexperienced with
multithreading though, so I was just hoping that anyone could give me
some pointers or advice on where to begin.

Basically I want to do the following:

1. I want to create X threads, let's just say 10 for argument's sake.

2. I want each thread to get its own assigned URL. Will there be a
problem with more than one thread accessing the same method?

3. After downloading the contents of the page I intend to put the
strings into a list. Will there be a problem with more than one thread
accessing the same object? If so, should I use semaphores?

I'm not asking anyone to write this for me, I just don't know where to
begin. If anyone can spare an example or any advice I am all ears.

Thanks,

Eli
Knute Johnson - 17 Jan 2007 02:37 GMT
> I wrote a method that will take a URL, and return its page in String
> form.
[quoted text clipped - 27 lines]
>
> Eli

You can run the same method in multiple threads, provided that you
synchronize access to any variables that are shared between
threads.  So if you write a method, getString(URL url), you can then
create a thread to run that method as follows:

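// url must be declared final in the enclosing scope (or be effectively
// final on Java 8 and later) so the anonymous class can use it
Runnable r = new Runnable() {
    public void run() {
        // the return value is discarded here; see the note below
        getString(url);
    }
};
new Thread(r).start();  // start() runs run() on the new thread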

You will need some code after the call to getString() to put the
returned string somewhere, but that is really all there is to it.
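
To make the "put it somewhere" part concrete, here is a rough sketch of one way it could look. It starts one thread per URL, collects the pages in a list wrapped with Collections.synchronizedList, and uses join() to wait for every thread. The class name, downloadAll(), and the stub getString() that just returns a placeholder string are assumptions made up for illustration; they are not the original poster's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ThreadPerUrlSketch {

    // Hypothetical stand-in for the poster's download method. A real version
    // would open the URL and read the page body.
    static String getString(String url) {
        return "contents of " + url;
    }

    // Starts one thread per URL, waits for all of them, and returns the pages.
    // The order of the results depends on which thread finishes first.
    static List<String> downloadAll(List<String> urls) {
        // synchronizedList makes each individual add() safe across threads.
        final List<String> pages =
                Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<Thread>();

        for (final String url : urls) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    pages.add(getString(url));
                }
            });
            threads.add(t);
            t.start();
        }

        try {
            // join() blocks until that thread has finished, so the list is
            // complete once this loop ends.
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> pages = downloadAll(Arrays.asList(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.com/c"));
        System.out.println(pages.size() + " pages downloaded");
    }
}
```

One thread per URL is fine for ten pages, but for thousands of pages you would want a fixed number of worker threads instead of one thread each.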

Start writing the program and post your progress.

Knute Johnson
email s/nospam/knute/

DyslexicAnaboko - 17 Jan 2007 20:29 GMT
Will do, thank you. That was very helpful and is exactly what I needed
to get me started.

Eli

> > I wrote a method that will take a URL, and return its page in String
> > form.
[quoted text clipped - 44 lines]
>
> Start writing the program and post your progress.
Daniel Pitts - 17 Jan 2007 21:01 GMT
> I wrote a method that will take a URL, and return its page in String
> form.
[quoted text clipped - 27 lines]
>
> Eli

Look at the java.util.concurrent package; it has helpful classes for
almost everything you're asking about.
<http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/package-summary.html>

Specifically, look at ThreadPoolExecutor and BlockingQueue.

You can submit download requests to the executor, and have them stuff
the results into the blocking queue.  You would have one or more
separate threads reading from the blocking queue and processing the
results.  If you want all the results to end up in one List, then you
either need to synchronize on that list, or have only one thread reading
from the BlockingQueue and writing to the list.
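
A minimal sketch of that arrangement, under some assumptions: a fixed-size pool from Executors.newFixedThreadPool runs the downloads, each task puts its result on a LinkedBlockingQueue, and a single thread (here the caller) drains the queue into an ordinary ArrayList. The class name, downloadAll(), and the stub getString() are made up for illustration and do no real network I/O.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class PoolAndQueueSketch {

    // Hypothetical stand-in for the download method; it never throws.
    static String getString(String url) {
        return "contents of " + url;
    }

    static List<String> downloadAll(List<String> urls, int threadCount) {
        // A pool with a fixed number of worker threads.
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        final BlockingQueue<String> results = new LinkedBlockingQueue<String>();

        for (final String url : urls) {
            pool.execute(new Runnable() {
                public void run() {
                    // The queue is thread-safe, so no extra locking is needed.
                    results.add(getString(url));
                }
            });
        }
        // No new tasks are accepted; tasks already submitted still run.
        pool.shutdown();

        // Only this thread touches the list, so it needs no synchronization.
        List<String> pages = new ArrayList<String>();
        try {
            for (int i = 0; i < urls.size(); i++) {
                pages.add(results.take()); // blocks until a result arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> pages = downloadAll(Arrays.asList(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.com/c",
                "http://example.com/d"), 2);
        System.out.println(pages.size() + " pages downloaded");
    }
}
```

The loop expects exactly one result per URL. A real download can fail, so each task would need to catch its exceptions and still put something on the queue (an error marker, for instance), otherwise take() would wait forever.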

If you are writing a Spider (or Robot, or whatever)... Be sure to
follow good netiquette and respect robots.txt
<http://www.robotstxt.org/>
DyslexicAnaboko - 18 Jan 2007 20:55 GMT
I never thought of my program as a robot, but I guess it could be
called that.

I was also worried about servers thinking that I may be attacking them
(DoS attacks), which is not my intention at all.
I will look through the link you provided. It never even crossed my
mind, so thanks for the heads up.

I am collecting anonymous information about random people on MySpace
and my friend is using the information for statistics. Everything is
nameless and faceless; we are using people's MySpace IDs only. It is
really neat stuff. That is why I am trying to speed up the program:
it is really painful to sit and wait for one page to be
downloaded at a time, especially when you are waiting on a sample of
10,000 people or more. There are roughly 149,142,765 accounts.

I will try working with the java.util.concurrent package as suggested.

Thank you,

Eli
DyslexicAnaboko - 16 Feb 2007 01:39 GMT
I wanted to apologize for not doing a follow-up post. The semester
started for me and I couldn't even think about the program after that.
I did, however, purchase a book on Java threads. Thanks again to
everyone for their help.

