Java Forum / General / August 2007

Downloading a file in Linux

Grzesiek - 19 Aug 2007 20:19 GMT
Hi,

I use the following function to download a jar file from my website:

public synchronized boolean copyFileFromWeb(){

     try
     {
             URL url  = new URL(sourceURL);
             URLConnection urlC = url.openConnection();
             InputStream is = url.openStream();
             System.out.print("Copying resource (type: " + urlC.getContentType());
             Date date=new Date(urlC.getLastModified());
             System.out.flush();
             FileOutputStream fos=null;
             fos = new FileOutputStream(destinationPath);
             int oneChar, count=0;
             while ((oneChar=is.read()) != -1)
             {
                fos.write(oneChar);
                count++;
             }
             is.close();
             fos.close();
             System.out.println(count + " byte(s) copied");
             return true;
     }
     catch (Exception e){
         System.err.println(e.toString());
     }
     return false;

}

In Windows XP it works perfectly, but in Linux it works very slow and
the downloaded file is corrupted! What is wrong?
Grzesiek - 19 Aug 2007 20:49 GMT
> Hi,
>
[quoted text clipped - 33 lines]
> In Windows XP it works perfectly, but in Linux it works very slow and
> the downloaded file is corrupted! What is wrong?

I wonder whether an HTTP proxy server is involved. I know that
some people put something like this in their code:

  System.setProperty("http.proxyHost", "xyz.com");
  System.setProperty("http.proxyPort", "8080");

Is it the case?

I found a link about downloading a file in Linux

http://linux.sys-con.com/read/39248.htm
Daniel Pitts - 19 Aug 2007 21:05 GMT
> Hi,
>
[quoted text clipped - 33 lines]
> In Windows XP it works perfectly, but in Linux it works very slow and
> the downloaded file is corrupted! What is wrong?

It shouldn't be any different on Linux, unless there is something else
fundamentally different about your set up.

Is it the same machine, dual booted into one OS or the other?  Is it
two similar machines on the same network subnet? Is it two very
different machines, or on different networks?   There are a lot of
possibilities here.

One thing I would suggest, regardless of your machines, is that you
read into a byte[] (at least 1024 bytes, if not larger, probably
between 16k and 256k) instead of one byte at a time.  It is extremely
inefficient to read/write one byte at a time.

BTW, instead of System.setProperty, you can pass
-Dhttp.proxyHost=xyz.com on the command line before the class name when
you execute your program, but I really don't think it's the proxy; I think
it's the byte-at-a-time reading.
Grzesiek - 19 Aug 2007 21:47 GMT
> > Hi,
>
[quoted text clipped - 53 lines]

Hi Daniel,

It is two similar machines on the same local network. But I tried to
run the program on a completely different machine in another network and it
didn't work either. The program didn't work on Windows 2000
either. But I'm not sure whether the copyFileFromWeb() function was the
only problem that time.

I read one byte at a time because I download a JAR file, not an image.
No corrupted bytes are allowed here. In fact I tried reading into
byte[1024] and byte[4096], but then the downloaded file is 140 kB and 160 kB
instead of 116 kB, which is the size of the file I want to download. The
too-large file is corrupted and cannot be run.

According to the link:

http://linux.sys-con.com/read/39248.htm

i changed the function like this:

public synchronized boolean copyFileFromWeb2(){
       try{
          URL url  = new URL(newConfigProgURL);
             URLConnection urlC = url.openConnection();
             InputStream is = url.openStream();
             System.out.print("Copying resource (type: " + urlC.getContentType());
             Date date=new Date(urlC.getLastModified());
             System.out.flush();
             FileOutputStream fos=null;
             fos = new FileOutputStream(tempConfigProgPath);
             DataOutputStream out=new DataOutputStream(fos);
             DataInputStream in=new DataInputStream(urlC.getInputStream());

             int oneChar, count=0;
             while ((oneChar=in.read()) != -1)
             {
                fos.write(oneChar);
                count++;
             }
             is.close();
             fos.close();
             System.out.println(count + " byte(s) copied");
             return true;
       }catch(Exception e){
           System.err.println(e);
       }
       return false;

   }

Now it works! Still, downloading the 116 kB jar file in Linux takes about
30 seconds, while in Windows XP it takes maybe 1 second.
Arne Vajhøj - 19 Aug 2007 22:14 GMT
> I read one byte at a time because I download a JAR file, not an image.
> No corrupted bytes are allowed here. In fact I tried reading into
> byte[1024] and byte[4096], but then the downloaded file is 140 kB and 160 kB
> instead of 116 kB, which is the size of the file I want to download. The
> too-large file is corrupted and cannot be run.

You can get any file by reading with large buffers - it only
affects performance not functionality.

Code snippet:

            URL url = new URL(urlstr);
            HttpURLConnection con = (HttpURLConnection)url.openConnection();
            con.connect();
            if(con.getResponseCode() == HttpURLConnection.HTTP_OK) {
               InputStream is = con.getInputStream();
               OutputStream os = new FileOutputStream(fnm);
               byte[] b = new byte[100000];
               int n;
               while((n = is.read(b)) >= 0) {
                  os.write(b,0,n);
               }
               os.close();
               is.close();
            }
            con.disconnect();

Arne
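A note on the oversized files reported earlier in the thread: the byte[1024] and byte[4096] attempts are not shown, so the following is only a guess at what went wrong. A common cause of a copy that comes out larger than the source is writing the whole buffer on every pass instead of only the n bytes that read() returned. The sketch below contrasts the two loops on in-memory streams; the class and method names are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class BufferedCopySketch {

    // Correct loop: write only the n bytes that read() actually delivered.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Hypothetical buggy loop: ignores the return value of read() and writes
    // the full buffer each time, so stale bytes pad every short read.
    static long copyWholeBuffer(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        while (in.read(buffer) >= 0) {
            out.write(buffer);
            total += buffer.length;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[2500]; // stand-in for the downloaded file

        ByteArrayOutputStream good = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), good, 1024);

        ByteArrayOutputStream bad = new ByteArrayOutputStream();
        copyWholeBuffer(new ByteArrayInputStream(data), bad, 1024);

        System.out.println("source=" + data.length
                + " correct=" + good.size()
                + " wholeBuffer=" + bad.size());
    }
}
```

On a network stream, read() returns short counts far more often than on an in-memory stream, so the padding from the second loop would be correspondingly larger.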
Grzesiek - 19 Aug 2007 22:55 GMT
On 19 Aug, 23:14, Arne Vajhøj <a...@vajhoej.dk> wrote:
> > I read one byte at a time because i download a JAR FILE not an image.
> > No corrupted bytes are allowed here. In fact i tried reading into
[quoted text clipped - 25 lines]
>
> Arne

Thanks Arne,

I used your snippet and now my function works fine :-) There is no
difference between Linux and Windows XP now. So reading one byte at a
time was the problem.

Thanks all :-)
Daniel Pitts - 20 Aug 2007 15:43 GMT
> On 19 Aug, 23:14, Arne Vajhøj <a...@vajhoej.dk> wrote:
>
[quoted text clipped - 35 lines]
>
> Thanx all :-)

Glad that worked for you.  Something else I forgot to mention was that
reading one character at a time (through a Reader) is VERY different
from reading one byte at a time (through an InputStream).  Java converts
characters using an encoding, which can corrupt binary data.  Unlike in
C/C++, a Java char is 2 bytes, and chars are encoded/decoded when written
to a Writer or read from a Reader, so you end up with unexpected values
if you try to read non-character data that way.
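To make the byte versus character distinction concrete, here is a small sketch with hypothetical class and method names. It reads the same single byte once through an InputStream and once through an InputStreamReader that is assumed to decode UTF-8. Note that the functions posted in this thread call read() on an InputStream, so despite the variable name oneChar they receive raw byte values and no decoding takes place.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVersusCharSketch {

    // InputStream.read() returns the next raw byte as an int in 0..255.
    static int readAsByte(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return in.read();
        }
    }

    // Reader.read() returns the next decoded char; bytes that are not valid
    // in the chosen encoding are replaced, so the original value is lost.
    static int readAsChar(byte[] data) throws IOException {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = {(byte) 0xFF}; // never a valid byte in UTF-8
        System.out.println("as byte: 0x" + Integer.toHexString(readAsByte(binary)));
        System.out.println("as char: 0x" + Integer.toHexString(readAsChar(binary)));
    }
}
```

The byte path hands back 0xFF unchanged, while the Reader path yields the Unicode replacement character U+FFFD, which is why binary files such as jars should be copied with streams, not readers.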
Thomas Hawtin - 19 Aug 2007 21:52 GMT
> Hi,
>
[quoted text clipped - 13 lines]
>               FileOutputStream fos=null;
>               fos = new FileOutputStream(destinationPath);

Why assign to null and then assign a proper value the statement after?

>               int oneChar, count=0;
>               while ((oneChar=is.read()) != -1)
>               {

Copying one byte at a time is liable to be relatively slow. At least copy
through a byte array.

>                  fos.write(oneChar);
>                  count++;
>               }
>               is.close();
>               fos.close();

These should each be in a finally block of a try-finally.

>               System.out.println(count + " byte(s) copied");
>               return true;
>       }
>       catch (Exception e){
>           System.err.println(e.toString());
>       }

It's not a great idea to catch Exception rather than the actual
exception type you wish to catch.
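The two review remarks above (close the streams in a finally block, catch the specific exception type) can be sketched as follows. This is only an illustration under assumptions: it uses try-with-resources, which arrived in Java 7 after this thread was written and is equivalent to closing each stream in its own finally block; the class name is hypothetical, and the method takes the URL and destination as parameters instead of the fields used in the original post.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

class SafeDownloadSketch {

    // Returns true on success, false if any I/O error occurred.
    static boolean copyFileFromWeb(URL source, Path destination) {
        // Both streams are closed automatically, in reverse order,
        // whether the copy loop completes or throws.
        try (InputStream in = source.openStream();
             OutputStream out = Files.newOutputStream(destination)) {
            byte[] buffer = new byte[16 * 1024];
            int n;
            while ((n = in.read(buffer)) >= 0) {
                out.write(buffer, 0, n);
            }
            return true;
        } catch (IOException e) {
            // Only I/O failures are handled here; programming errors such as
            // NullPointerException are no longer silently swallowed.
            System.err.println("Download failed: " + e);
            return false;
        }
    }
}
```

Catching IOException instead of Exception keeps the "return false" path for network and disk failures while letting unexpected runtime exceptions surface.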

>       return false;
>
> }
>
> In Windows XP it works perfectly, but in Linux it works very slow and
> the downloaded file is corrupted! What is wrong?

When you say slowly, is it the first byte which is slow, or each
subsequent byte? If it is only up to the first byte, then an obvious
suspect is DNS misconfiguration (which happens more often on Windows).

When you say the file is corrupt, what do you actually get? Truncated?
Complete rubbish? Some bytes wrong? Something else?

You might want to try nc to see what the web server is actually doing.

Tom Hawtin
Grzesiek - 19 Aug 2007 22:02 GMT
Hi Tom
> When you say the file is corrupt, what do you actually get? Truncated?
> Complete rubbish? Some bytes wrong? Something else?

When I used my first copyFileFromWeb() function with one-byte-at-a-time
reading, I got a truncated file. It was 73 kB instead of 116 kB and the
error was Socket Error: connection reset. So I don't think it's a DNS
error.
Each time the connection was reset at 73 kB!

But I updated copyFileFromWeb() and I wonder why it works now. And I still
wonder why it is by far slower on Linux than on Windows XP.
Lothar Kimmeringer - 20 Aug 2007 09:54 GMT
> When I used my first copyFileFromWeb() function with one-byte-at-a-time
> reading, I got a truncated file. It was 73 kB instead of 116 kB and the
> error was Socket Error: connection reset. So I don't think it's a DNS
> error.
> Each time the connection was reset at 73 kB!

Connection reset means that the connection has been closed by
the partner (or an intermediate proxy). Most operating systems,
and also the underlying framework you're using (HttpURLConnection
in your case), do some buffering as well, so what happens here is
that the connection reads 73 kB into a buffer that you then
read byte by byte.

It seems that the unbuffered write to the filesystem is taking
much longer on Linux than on Windows. Writing to a filesystem is
OS-dependent and, in the case of Linux, also depends on the type
of filesystem. If it's network-based (NFS, SMB, ...),
writing one single byte (incl. sync etc.) might take some time.

Because it takes so long and you're actually reading from a local
buffer rather than the network connection, the server gets bored
and closes the connection once a timeout is reached. The
moment your internal buffer is empty and the connection tries
to fetch the next bunch of data, it runs into a wall
(connection reset).

> But I updated copyFileFromWeb() and I wonder why it works now. And I still
> wonder why it is by far slower on Linux than on Windows XP.

Writing and reading blockwise (with an array) is much more efficient
than single bytes (which can be regarded as blocks of length 1),
because that's the way file and network operations are designed
to work. Because the internal buffer of the connection is now emptied
much faster, the timeout on the server side doesn't happen, and
you therefore receive the whole bunch of data.
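The cost of single-byte operations described above can be made visible with a small sketch. It counts how many write calls reach the underlying sink when bytes are written one at a time, with and without a BufferedOutputStream in between. CountingSink is a hypothetical stand-in for a file or socket, where each call would normally cost a system call; the 8192-byte buffer size is an arbitrary assumption.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteCallCounterSketch {

    // Hypothetical sink that only records how often it is called.
    static final class CountingSink extends OutputStream {
        int calls;
        long bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    // Writes 'total' bytes one at a time, optionally through a buffer.
    static CountingSink writeByteByByte(int total, boolean buffered) throws IOException {
        CountingSink sink = new CountingSink();
        OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink;
        for (int i = 0; i < total; i++) {
            out.write(i); // only the low 8 bits are written
        }
        out.flush(); // push any bytes still held in the buffer
        return sink;
    }

    public static void main(String[] args) throws IOException {
        CountingSink plain = writeByteByByte(10_000, false);
        CountingSink wrapped = writeByteByByte(10_000, true);
        System.out.println("unbuffered calls: " + plain.calls);
        System.out.println("buffered calls:   " + wrapped.calls);
    }
}
```

Without the buffer, every byte becomes its own call to the sink; with it, the same bytes arrive in a handful of block writes, which is the effect that reading and writing through a byte array achieves directly.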

Regards, Lothar
Lothar Kimmeringer                E-Mail: spamfang@kimmeringer.de
              PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
                questions!


