
Java Forum / First Aid / December 2005


URL NoFileFound Issue jre1.4.2

BigMac - 06 Dec 2005 01:47 GMT
Consider the Java class URLTestClass (below). If I use the "urlGood"
string as the URL param, I get a good HTML response which I can read.
If I send the "urlBad" string, I get the exception below.
However, if I take the string indicated as urlBad, cut/paste it into
Firefox or IE's address line, and press return... voilà, a good
response.  Is there something really stupid I'm doing?  I can only
imagine that I have some special characters that I'm not escaping or
something... I'm so frustrated and would welcome some belittling and
smug comments on my weak programming skills... just so I could get an
answer. ...

java.io.FileNotFoundException: http://www.google.com/search?hl=en&q=helpme&btnG=Google+Search
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:595)
    at test.URLTestClass.main(URLTestClass.java:34)

package test;

import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

public class URLTestClass {

    /**
    * Constructor for URLTestClass.
    */
    public URLTestClass() {

    }

    public static void main(String args[]) {
        String urlGood = "http://www.google.com";
        String urlBad =
            "http://www.google.com/search?hl=en&q=helpme&btnG=Google+Search";

        try {
            URL url = new URL(urlBad);
            InputStream is = url.openConnection().getInputStream();
            java.io.BufferedReader br = new java.io.BufferedReader(
                    new InputStreamReader(is));
            for (String l = null;(l = br.readLine()) != null;)
                System.out.println(l);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

bherbst65@hotmail.com - 06 Dec 2005 15:29 GMT
Hi BigMac,

I am no guru on this.
I have tested the complete urlBad in IE and it does work, as you
noted.

To test further, I placed parts of urlBad in your program and noted
the response. I can get as far as
"http://www.google.com/search?hl=en&q="
and that does work in your program: it downloads the web page as text,
just as you would see with "urlGood".

My guess is that it cannot go further because the final step, looking
up the word placed in the search box ("helpme"), comes after that
initial step.

I have also tested urlBad inside a Java applet that pulls the page as
a web page rather than as text. I suppose that since an applet calls
the web page to start, it is logical that it will allow your "urlBad"
to do its job of getting to the Google "helpme" page.

Sorry that I can't get it to run as you intended.

Bob
BigMac - 07 Dec 2005 03:10 GMT
This seems like such a trivial issue, yet no one posted an answer.  I
guess I never thought I'd stump this group... I'm still in desperate
need to figure this out.  Any more help?  I am willing to bribe :)
Steve - 08 Dec 2005 02:13 GMT
> This seems like such a trivial issue, yet no one posted an answer.  I
> guess I never thought I'd stump this group... I'm still in desperate
> need to figure this out.  Any more help?  I am willing to bribe :)

Looks like Google knows you are using Java and rejects the request (HTTP
error 403 with Java 1.5). However, you can set the User-Agent request
header to trick it into thinking you are using a browser...

package test;

import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class URLTestClass {

    /**
    * Constructor for URLTestClass.
    */
    public URLTestClass() {

    }

    public static void main(String args[]) {
        String urlGood = "http://www.google.com";
        String urlBad =
"http://www.google.com/search?hl=en&q=helpme&btnG=Google+Search";

        try {
            URL url = new URL(urlBad);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setDoInput(true);
            conn.setDoOutput(false);
            conn.setRequestMethod("GET");
            conn.setRequestProperty(
                "User-Agent",
                "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; "
                    + ".NET CLR 1.0.3705; .NET CLR 1.1.4322)");
            conn.connect();
           
            InputStream is = conn.getInputStream();
            java.io.BufferedReader br = new java.io.BufferedReader(
                    new InputStreamReader(is));
            for (String l = null; (l = br.readLine()) != null;)
                System.out.println(l);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Andrew Thompson - 08 Dec 2005 02:20 GMT
> This seems like such a trivial issue, yet no one posted an answer.

Not in the entire 26 hours since you asked, no..

> ... I
> guess I never thought I'd stump this group... I'm still in desperate
> need to figure this out.  Any more help?  I am willing to bribe :)

Are you willing to be patient?  This is usenet after all,
and you should be willing to wait at least 48 hours before
considering giving the group a 'hurry up' type message.

In any case, since it seems no-one else has mentioned it..

Do this test first with a known site of your own.
Google has been known to add some little quirks to its
responses to prevent Java programs accessing it directly.

[ And the questions you should be asking are
*Why* do Google do such things?
What does that imply, or mean? ]
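
The advice to test against a site of your own can be tried without any
external host. The following is only a sketch under stated assumptions:
it uses the JDK's built-in com.sun.net.httpserver.HttpServer (Java 6 or
later, so not the 1.4.2 runtime from the question) and lambdas (Java 8).
The class name UserAgentCheck, the "/search" path, and the rule "answer
403 unless the User-Agent starts with Mozilla/" are all made up for
illustration; nobody in this thread knows what check Google actually
applies.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class UserAgentCheck {

    /** Starts a throwaway local server that answers 403 unless the
     *  User-Agent looks like a browser (hypothetical rule). */
    public static HttpServer startPickyServer() throws IOException {
        HttpServer server =
            HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/search", exchange -> {
            String agent = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean browserLike = agent != null && agent.startsWith("Mozilla/");
            byte[] body = (browserLike ? "results" : "forbidden").getBytes("UTF-8");
            exchange.sendResponseHeaders(browserLike ? 200 : 403, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Returns the HTTP status of a GET; a null userAgent leaves
     *  Java's default User-Agent header in place. */
    public static int statusFor(String address, String userAgent)
            throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(address).openConnection();
        if (userAgent != null) {
            conn.setRequestProperty("User-Agent", userAgent);
        }
        try {
            // getResponseCode() reports the status without throwing on 4xx.
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startPickyServer();
        try {
            String address = "http://127.0.0.1:"
                + server.getAddress().getPort() + "/search?q=helpme";
            System.out.println("default User-Agent: "
                + statusFor(address, null));
            System.out.println("browser User-Agent: "
                + statusFor(address, "Mozilla/4.0 (compatible; MSIE 6.0)"));
        } finally {
            server.stop(0);
        }
    }
}
```

Calling getResponseCode() before getInputStream() is the useful part
for debugging: it shows the status the server actually sent, which the
exception from getInputStream() alone does not make obvious.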

Signature

Andrew Thompson
physci, javasaver, 1point1c, lensescapes - athompson.info/andrew

BigMac - 08 Dec 2005 15:01 GMT
I just re-read my earlier message... I definitely didn't mean to sound
the way I think it sounded.  It was a poor attempt at humor.  I
appreciate all comments... and didn't mean to imply otherwise.

The problem I posted occurs with other sites as well... I have the same
issue happening with whitepages.com... here is the url that fails via
URL but succeeds via address bar:

http://www.whitepages.com/10001/search/Find_Person?firstname_begins_with=1&firstname=John&name=Smith&city_zip=Roseville&state_id=MI&metro_area=1&printer_friendly=1


This just seems so "bothersome" to me because it sorta goes against
everything I think that "should" work... if it works in an address bar
I assume it should work as a URL.

I'm going to try to get some sort of sniffing utility to see exactly
what is being sent to google/whitepages when I use the address line and
when I use URL/java and see if there is something different.  Any other
comments are really, really welcome.
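
Short of installing a packet sniffer, one way to see what Java sends
is to point HttpURLConnection at a local socket and print the raw
request. This is a sketch under assumptions, not a replacement for a
real sniffer: the class name RequestDump and the method captureRequest
are invented for illustration, it needs Java 8 for the lambda, and it
only shows the Java side, so the browser's request still has to be
captured separately for comparison.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class RequestDump {

    /** Sends one GET from HttpURLConnection to a local socket and
     *  returns the request line and headers exactly as received. */
    public static List<String> captureRequest(String pathAndQuery)
            throws Exception {
        final List<String> lines = new ArrayList<String>();
        try (ServerSocket listener =
                 new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            Thread server = new Thread(() -> {
                try (Socket socket = listener.accept()) {
                    BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(),
                                              "ISO-8859-1"));
                    String line;
                    // The header section ends at the first empty line.
                    while ((line = in.readLine()) != null && !line.isEmpty()) {
                        lines.add(line);
                    }
                    OutputStream out = socket.getOutputStream();
                    out.write(("HTTP/1.0 200 OK\r\n"
                        + "Content-Length: 0\r\n"
                        + "Connection: close\r\n\r\n").getBytes("ISO-8859-1"));
                    out.flush();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            server.start();

            URL url = new URL("http://127.0.0.1:"
                + listener.getLocalPort() + pathAndQuery);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.getResponseCode();   // forces the request to be sent
            conn.disconnect();
            server.join();            // wait until the lines are recorded
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line
                : captureRequest("/search?hl=en&q=helpme&btnG=Google+Search")) {
            System.out.println(line);
        }
    }
}
```

The line to look at is the User-Agent header: by default it starts
with "Java/" followed by the runtime version, which is one visible
difference from what a browser sends.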

> > This seems like such a trivial issue, yet no one posted an answer.
>
[quoted text clipped - 21 lines]
> Andrew Thompson
> physci, javasaver, 1point1c, lensescapes - athompson.info/andrew
Andrew Thompson - 09 Dec 2005 09:26 GMT
> I just re-read my earlier message... I definitely didn't mean to sound
> the way I think it sounded.  It was a poor attempt at humor.  I
> appreciate all comments... and didn't mean to imply otherwise.

Cool.  (next we'll deal with the top-posting, but, in the meantime..)

> The problem I posted occurs with other sites as well... I have the same
> issue happening with whitepages.com... here is the url that fails via
> URL but succeeds via address bar:
>
> http://www.whitepages.com/10001/search/Find_Person?firstname_begins_with=1&firstname=John&name=Smith&city_zip=Roseville&state_id=MI&metro_area=1&printer_friendly=1

Try this as a main()..

  public static void main(String[] args) throws Exception {
    URL url = new URL("http://www.whitepages.com/" +
      "10001/search/Find_Person?firstname_begins_with" +
      "=1&firstname=John&name=Smith&city_zip=Roseville" +
      "&state_id=MI&metro_area=1&printer_friendly=1");
    try {
      URLConnection urlc = url.openConnection();
      InputStream is = urlc.getInputStream();
      System.out.println( "connection opened");
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

> This just seems so "bothersome" ..

Yes.  I gave you a clue to the cause in my first post, and
steve gave you code that (I guess) will 'fix' it, but you do
not seem to be paying close attention.

Signature

Andrew Thompson
physci, javasaver, 1point1c, lensescapes - athompson.info/andrew

Nigel Wade - 09 Dec 2005 14:40 GMT
> I just re-read my earlier message... I definitely didn't mean to sound
> the way I think it sounded.  It was a poor attempt at humor.  I
[quoted text clipped - 3 lines]
> issue happening with whitepages.com... here is the url that fails via
> URL but succeeds via address bar:
>
> http://www.whitepages.com/10001/search/Find_Person?firstname_begins_with=1&firstname=John&name=Smith&city_zip=Roseville&state_id=MI&metro_area=1&printer_friendly=1


> This just seems so "bothersome" to me because it sorta goes against
> everything I think that "should" work... if it works in an address bar
> I assume it should work as a URL.

You may be heading up a blind alley here.

Many search engines go to great lengths to prevent you from doing automated
searches. The idea is to stop you from putting up your own "search engine"
which does nothing other than pass off requests to their site while bypassing
all their adverts.

Signature

Nigel Wade, System Administrator, Space Plasma Physics Group,
           University of Leicester, Leicester, LE1 7RH, UK
E-mail :    nmw@ion.le.ac.uk
Phone :     +44 (0)116 2523548, Fax : +44 (0)116 2523555

Roedy Green - 07 Dec 2005 09:08 GMT
>http://www.google.com/search?hl=en&q=helpme&btnG=Google+Search

Try URL-encoding that.  See http://mindprod.com/jgloss/urlencoded.html

Look in the address bar of your browser or its history to see how your
browser sends it to the server, or use a packet sniffer.

see http://mindprod.com/jgloss/sniffer.html
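
A minimal sketch of the encoding suggestion, assuming the search term
comes from user input: java.net.URLEncoder.encode(String, String) is a
real JDK method (available since 1.4), while the class name
QueryEncoding and the method searchUrl are made up for illustration.
Only the parameter value is encoded; encoding the whole URL would also
escape the "://", "?" and "&" that give it its structure. The URL in
the original question contains no characters that need escaping, so
encoding is probably not the cause there, but it matters as soon as
the term contains spaces or "&".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryEncoding {

    /** Builds a search URL, escaping only the user-supplied value. */
    public static String searchUrl(String query)
            throws UnsupportedEncodingException {
        return "http://www.google.com/search?hl=en&q="
            + URLEncoder.encode(query, "UTF-8");
    }

    public static void main(String[] args)
            throws UnsupportedEncodingException {
        // Spaces become '+', and '&' becomes %26 so it is not
        // mistaken for a parameter separator.
        System.out.println(searchUrl("help me & more"));
    }
}
```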

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.


