Java Forum / First Aid / September 2004

Downloading HTML files from Server..

Patrick - 16 Sep 2004 16:43 GMT
A website runs a Fantasy Football League for English soccer in
England.
They put up a list of the players and the points they have.
I am running a fantasy football league for just a few friends.
I want to download the scores from this website,
then parse them,
then calculate the points for all the teams in our little Fantasy
Football league.

Now to my problems...

If I navigate to

http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH


And click on a link which calls the following JavaScript function

"javascript:dt_pop('PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167',
'remote', 610, 550, 10, 10, 'no', 'yes', 'no', 'no'); "

A new page pops up, and the browser says its URL is

http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167


Now with my code

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class DownloadWebPage
{   public static void main (String[] args) throws IOException
    {
         URL url = new URL("http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1");
         BufferedReader webRead = new BufferedReader(new InputStreamReader(url.openStream()));
         String line;
         // Print the response body line by line
         while ((line = webRead.readLine()) != null)
         {
             System.out.println(line);
         }
         webRead.close();
    }
}

I can download the page at

http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH


But I cannot download the page at

http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167


This is the page that the JavaScript function

"javascript:dt_pop('PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167',
'remote', 610, 550, 10, 10, 'no', 'yes', 'no', 'no'); "

in the page

http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH

calls.

When I try to download

http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167


With my code, all I get is

<HTML><HEAD><SCRIPT LANGUAGE="JAVASCRIPT">location.replace("http://www.dreamteamfc.com");</SCRIPT></HEAD></HTML>

I noticed that when I access

    http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH


The server sends a cookie. And then when I access

    http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167


I get the table of players and their respective points.

But when I try to access

    http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167


without accessing

    http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH


first, I just get redirected to

    http://www.dreamteamfc.com/dtfc04/servlet/OpenFSELogin?homename=dtfc04&language=ENGLISH

   

I used the following code

    public static void main (String[] args) throws IOException
    {
           URL url = new URL("http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167");
           URLConnection uc = url.openConnection();                
           System.out.println(uc.getHeaderField("Set-Cookie"));
    }

To get the cookie, which was

    CF_HA=2415676698; Domain=.dreamteamfc.com; expires=Tue, 14-Sep-04 22:25:46 GMT; Path=/

I think
    CF_HA, is just a unique identifier, a variable which is incremented
by the server for each new client
    Domain, is just the domain
    expires, is just the expiry date
    Path, hmm dunno
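
For reference, the Path attribute limits the cookie to URLs on that host whose path starts with the given prefix, so Path=/ covers the whole site. The first name=value pair is the cookie itself; the remaining attributes are instructions to the client and are not sent back to the server. A minimal sketch of pulling the header apart, where the class and method names are hypothetical and not part of the original code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not from the original post.
final class SetCookieSketch {

    /** Splits a Set-Cookie header value into its name=value pair and its attributes. */
    static Map<String, String> parse(String setCookie) {
        Map<String, String> parts = new LinkedHashMap<>();
        for (String segment : setCookie.split(";")) {
            String s = segment.trim();
            if (s.isEmpty()) {
                continue;
            }
            int eq = s.indexOf('=');
            // Attributes without '=' (for example "Secure") get an empty value.
            String key = eq < 0 ? s : s.substring(0, eq).trim();
            String value = eq < 0 ? "" : s.substring(eq + 1).trim();
            parts.put(key, value);
        }
        return parts;
    }

    public static void main(String[] args) {
        String header = "CF_HA=2415676698; Domain=.dreamteamfc.com; "
                + "expires=Tue, 14-Sep-04 22:25:46 GMT; Path=/";
        // The first entry is the cookie; the rest are attributes for the client.
        parse(header).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Only the first entry (CF_HA and its value) would normally go into a Cookie request header.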

Now I hardcoded the cookie into the code, with a valid expiry date
   public static void main (String[] args) throws IOException
   {
       URL url = new URL("http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167");
       URLConnection uc = url.openConnection();
       
       String cookie = "CF_HA=2415676698; Domain=.dreamteamfc.com; expires=Tue, 14-Sep-04 22:25:46 GMT; Path=/";
       uc.setRequestProperty("cookie",cookie);
       int i = 0;
       
       while ((i = uc.getInputStream().read()) != -1)
       {   System.out.print((char) i);
       }                
   }

Now, when I run this code I get the following error

Exception in thread "main" java.io.IOException: Server returned HTTP response code: 400 for URL: http://www.dreamteamfc.com/dtfc04/servlet/PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1133)
    at Test.main(Test.java:46)

Am I sending the cookie correctly?

Is there something else I must do?

Any help/advice appreciated,
regards
pat
Andrew Chambers - 17 Sep 2004 15:47 GMT
Hi Patrick,

Whilst you can accomplish this kind of task using Java, there are
other tools more specifically designed for it.  Have you ever
heard of curl?

http://curl.haxx.se/

This tool grabs data over the internet using a variety of protocols
(HTTP, FTP, etc.), and for this relatively simple problem it should be
ideal.

Andy
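
For anyone who would rather stay in Java, here is a rough sketch of the two-request approach described in the question: fetch the login page, collect whatever cookies it sets, and send back only their name=value pairs (not Domain, expires or Path) in the Cookie header of the second request. It assumes the server checks nothing beyond that cookie, which the thread does not confirm, and the class and helper names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the URLs are the ones quoted in the thread.
final class SessionFetchSketch {

    /** Builds a Cookie request header from raw Set-Cookie values, keeping name=value only. */
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            int semi = value.indexOf(';');
            pairs.add((semi < 0 ? value : value.substring(0, semi)).trim());
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) throws IOException {
        String base = "http://www.dreamteamfc.com/dtfc04/servlet/";

        // Step 1: request the login page so the server issues its cookie(s).
        URLConnection login =
                new URL(base + "OpenFSELogin?homename=dtfc04&language=ENGLISH").openConnection();
        List<String> setCookies = new ArrayList<>();
        for (Map.Entry<String, List<String>> h : login.getHeaderFields().entrySet()) {
            // The status line is stored under a null key, so compare from the literal.
            if ("Set-Cookie".equalsIgnoreCase(h.getKey())) {
                setCookies.addAll(h.getValue());
            }
        }

        // Step 2: request the player list, presenting the cookies just received.
        URLConnection list =
                new URL(base + "PostPlayerList?catidx=1&title=GOALKEEPERS&gameid=167").openConnection();
        if (!setCookies.isEmpty()) {
            list.setRequestProperty("Cookie", toCookieHeader(setCookies));
        }

        try (BufferedReader in = new BufferedReader(new InputStreamReader(list.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

If the server still redirects, it may also be checking something else, such as the Referer header; that would need to be verified against the live site.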

