Java Forum / General / January 2007

using sockets to open connection to a search engine

Damo - 15 Jan 2007 22:47 GMT
Hi,
I'm trying to open a connection to altavista.com from Java to
retrieve the search results for a query. This is the code I'm using. It
works for Google and Yahoo, but not for AltaVista or MSN.

s = new Socket("altavista.com",80);
p = new PrintStream(s.getOutputStream());
p.print("GET /web/results?q=java HTTP/1.0\r\n");
p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
p.print("Connection: close\r\n\r\n");
in = s.getInputStream();

If you type www.altavista.com/web/results?q=java into
the address bar, it returns the result page.

Can anyone help me?
Thanks
Arne Vajhøj - 15 Jan 2007 22:58 GMT
> Hi,
> I'm trying to open a connection to altavista.com through java to
[quoted text clipped - 11 lines]
> If you type this : www.altavista.com/web/results?q=java           into
> the address bar, it will return the result page.

Put something in between your browser and AltaVista and
see what the browser sends.

You already have User-Agent, but maybe it wants Referer or
Accept or Accept-Language or Accept-Encoding.

Or maybe it wants HTTP/1.1 (which requires Host).

There is a limited number of things to add until
you are fully browser compatible.

Arne
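
The suggestion above can be sketched as a small request builder. This is only an illustration: the class name RequestBuilder and the header values are assumptions, not something from the thread, and nothing here says which headers AltaVista actually requires. The point is that HTTP/1.1 needs a Host header, and that extra browser-style headers can be added one at a time to see which one changes the response.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original thread.
class RequestBuilder {
    /** Builds a GET request. HTTP/1.1 requires the Host header. */
    static String build(String host, String path, Map<String, String> extraHeaders) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        // Optional browser-style headers, written in insertion order.
        for (Map.Entry<String, String> h : extraHeaders.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        // Ask the server to close the connection, then end the header block.
        sb.append("Connection: close\r\n\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "text/html");              // assumed value
        headers.put("Accept-Language", "en-us,en;q=0.5"); // assumed value
        System.out.print(build("www.altavista.com", "/web/results?q=java", headers));
    }
}
```

The resulting string can be written to the socket's output stream in place of the hand-typed request.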
Damo - 15 Jan 2007 23:04 GMT
Sorry, I meant to say the error was a 404: resource not found on this
server.
So it's connecting but not returning the results.
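
A 404 means the server was reached and answered, but did not find the requested resource. As a rough sketch, the status code can be pulled out of the first line of the response so the program can report it directly. The class name StatusLine is hypothetical and the parsing is deliberately minimal.

```java
// Hypothetical helper for illustration; not part of the original thread.
class StatusLine {
    /** Extracts the numeric code from a line such as "HTTP/1.1 404 Not Found". */
    static int parseCode(String statusLine) {
        // Split into protocol version, status code, and reason phrase.
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(parseCode("HTTP/1.0 404 Not Found")); // prints 404
        System.out.println(parseCode("HTTP/1.1 200 OK"));        // prints 200
    }
}
```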
Tom Hawtin - 15 Jan 2007 23:25 GMT
> If you type this : www.altavista.com/web/results?q=java           into
> the address bar, it will return the result page.

This seems to work (once I managed to spell alta-vista with both Ts -
shouldn't have repeated myself):

import java.io.*;
import java.net.*;

class Search {
    public static void main(String[] args) throws Exception {
        Socket s = new Socket("www.altavista.com",80);
        String request =
            "GET /web/results?q=java HTTP/1.1\r\n"+
            "Host: www.altavista.com:80\r\n"+
            "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;rv:1.8.1) "+
            "Gecko/20061010 Firefox/2.\r\n"+
            "Connection: close\r\n\r\n";
        OutputStream out = s.getOutputStream();
        out.write(request.getBytes());
        out.flush();
        InputStream in = s.getInputStream();
        for (;;) {
            int b = in.read();
            if (b == -1) { break; }
            System.out.print((char)b);
        }
    }
}

Tom Hawtin
Damo - 15 Jan 2007 23:39 GMT
excellent, cheers, that did the trick
Martin Gregorie - 16 Jan 2007 00:12 GMT
> Hi,
> I'm trying to open a connection to altavista.com through java to
[quoted text clipped - 14 lines]
> Can anyone help me
> Thanks

Try opening the socket to "www.altavista.com".

It's not the same host as "altavista.com". You can see the difference by
pinging them both and looking at the IPs and true host names.
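
The same comparison can be made from Java instead of ping. A minimal sketch, assuming a hypothetical HostCompare class: it resolves both names with InetAddress.getAllByName and compares the address sets. It needs working DNS, and the two names may resolve differently today (or not at all) than they did when this thread was written.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the original thread.
class HostCompare {
    /** Collects the textual IP addresses of a resolved host into a sorted set. */
    static Set<String> addresses(InetAddress[] resolved) {
        Set<String> out = new TreeSet<>();
        for (InetAddress a : resolved) {
            out.add(a.getHostAddress());
        }
        return out;
    }

    /** True when both lookups produced exactly the same set of addresses. */
    static boolean sameHosts(InetAddress[] a, InetAddress[] b) {
        return addresses(a).equals(addresses(b));
    }

    public static void main(String[] args) throws UnknownHostException {
        // The two names discussed in the thread; the lookups need DNS access.
        InetAddress[] bare = InetAddress.getAllByName("altavista.com");
        InetAddress[] www = InetAddress.getAllByName("www.altavista.com");
        System.out.println("altavista.com     -> " + addresses(bare));
        System.out.println("www.altavista.com -> " + addresses(www));
        System.out.println("same addresses: " + sameHosts(bare, www));
    }
}
```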

Signature

martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


