Java Forum / General / July 2007

java.io.File to java.lang.String

Benjamin - 24 May 2007 03:20 GMT
What's the best way to get the contents of a file represented by a
java.io.File object into a String?
Knute Johnson - 24 May 2007 03:56 GMT
> What's the best way to get the contents of a file represented by a
> java.io.File object into a String?

You don't specify what you consider best so how about simple as best?
Now for my curiosity, why would you want to do this?

import java.io.*;

public class test {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        File f = new File(args[0]);
        FileInputStream fis = new FileInputStream(f);
        int n;
        while ((n = fis.read()) != -1)
            baos.write(n);
        fis.close();
        String str = baos.toString();
        System.out.println(str);
    }
}

Signature

Knute Johnson
email s/nospam/knute/

Tom Hawtin - 24 May 2007 15:03 GMT
>> What's the best way to get the contents of a file represented by a
>> java.io.File object into a String?

> import java.io.*;
>
[quoted text clipped - 3 lines]
>         File f = new File(args[0]);
>         FileInputStream fis = new FileInputStream(f);

Going for a FileReader would probably be better.

The next few lines should be wrapped in a try/finally.

>         int n;
>         while ((n = fis.read()) != -1)

One byte at a time. Not going to be fast.

>             baos.write(n);
>         fis.close();
>         String str = baos.toString();

I don't believe that will do anything useful.

>         System.out.println(str);
>     }
> }
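
A minimal sketch of the three suggestions above taken together: a Reader rather than a raw byte stream, a try/finally around the read, and block reads instead of one byte at a time. The names FileToString, readAll and readFile, the 8192-char buffer and the explicit charset parameter are assumptions made for illustration; nobody in the thread posted this code. A FileReader would also work, but it decodes with the platform default charset, so the sketch uses an InputStreamReader to make the encoding explicit.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class FileToString {
    // Drains a Reader into a String, reading a block of chars per call.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192]; // buffer size is an arbitrary choice
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n); // append only the chars actually read
        }
        return sb.toString();
    }

    // Opens the file with an explicit charset and always closes it.
    static String readFile(File file, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(new FileInputStream(file), charset);
        try {
            return readAll(reader);
        } finally {
            reader.close();
        }
    }
}
```

Usage would be along the lines of String text = FileToString.readFile(new File(args[0]), Charset.forName("UTF-8"));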
Jeff Higgins - 24 May 2007 18:01 GMT
> One byte at a time. Not going to be fast.

Hi Tom,

Your comment prompted me to look for ways to do
block(bulk) read operations on text files.

One way I've come up with is below.
Will you comment on this, and can you suggest alternatives?

Thanks,
Jeff Higgins

import java.io.*;
import java.nio.CharBuffer;

public class TestBlockRead {
 public static void main(String[] args)
 {
   try
   {
     File file = new File("file.9612544.bytes");
     FileReader fileReader = new FileReader(file);
     CharBuffer charBuffer = CharBuffer.allocate((int)file.length());
     fileReader.read(charBuffer);
   }
   catch (FileNotFoundException e)
   {
     e.printStackTrace();
   }
   catch (IOException e)
   {
     e.printStackTrace();
   }
 }
}
Tom Hawtin - 24 May 2007 18:34 GMT
>       File file = new File("file.9612544.bytes");

Still need try-finally.

>       FileReader fileReader = new FileReader(file);
>       CharBuffer charBuffer = CharBuffer.allocate((int)file.length());

This could allocate a buffer three times too large, or way too small for
a huge file. allocateDirect may be a win if it were reused as a
temporary buffer (but I bet the implementation messes up somewhere).

>       fileReader.read(charBuffer);

This does not necessarily read all that could be read. Should be in a loop.

Tom Hawtin
Jeff Higgins - 25 May 2007 02:24 GMT
>>       File file = new File("file.9612544.bytes");
>
> Still need try-finally.

Yes, thanks.

>>       FileReader fileReader = new FileReader(file);
>>       CharBuffer charBuffer = CharBuffer.allocate((int)file.length());
>
> This could allocate a buffer three times too large,

Going over Javadocs... could you elaborate?

>or way too small for a huge file.

OK, yes.

> allocateDirect may be a win if it were reused as a
> temporary buffer (but I bet the implementation messes up somewhere).

Skipping over this for the time being.

>>       fileReader.read(charBuffer);
>
> This does not necessarily read all that could be read. Should be in a
> loop.

Again, I'm sorry but I haven't been able to figure out what might
cause read(charBuffer) to not read all that could be read?

Is this a sufficient loop?
while(fileReader.ready()){fileReader.read(charBuffer);}

Appreciate your help.

Thanks,

Jeff Higgins
Lew - 25 May 2007 04:57 GMT
Tom Hawtin wrote:
>>>       FileReader fileReader = new FileReader(file);
>>>       CharBuffer charBuffer = CharBuffer.allocate((int)file.length());
>> This could allocate a buffer three times too large,

> Going over Javadocs... could you elaborate?

Because Strings and Chars are encoded, as are files.  UTF-8, for example, uses
one to three bytes per character depending on the character set and other
factors.

I'm not sure about how Tom arrived at three times as large but I can easily
see how the CharBuffer could be twice as large as the file data. CharBuffers
are allocated at two bytes per character.  A file encoding that uses 8 bits
per character will only fill half such a buffer.  I'm guessing that Tom is
familiar with some combination of encoding schemes that would have the
CharBuffer wind up three times too large for the file.

>> or way too small for a huge file.

If the file uses a multibyte encoding with lots of characters that require
more than two bytes each.
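
To make the bytes-versus-chars point concrete, here is a small sketch that counts how many bytes the same text occupies under different encodings. EncodingSizes and byteLength are made-up names, and the sample strings are arbitrary. It suggests one way a factor of three can arise: text made of characters that each take three bytes in UTF-8 decodes to one char per three bytes of file, so a CharBuffer whose capacity is taken from file.length() would have three times the capacity it needs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes the text occupies when encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "abc";          // 3 chars
        String euros = "\u20ac\u20ac"; // 2 chars, each 3 bytes in UTF-8
        System.out.println(byteLength(ascii, StandardCharsets.US_ASCII)); // 3
        System.out.println(byteLength(ascii, StandardCharsets.UTF_16BE)); // 6
        System.out.println(byteLength(euros, StandardCharsets.UTF_8));    // 6
    }
}
```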

>>>       fileReader.read(charBuffer);
>> This does not necessarily read all that could be read. Should be in a
[quoted text clipped - 5 lines]
> Is this a sufficient loop?
> while(fileReader.ready()){fileReader.read(charBuffer);}

No.  You'll have to fill the buffer, flip() it, read it to store or process
the data, then rewind() and repeat.  I haven't played with java.nio much but
if I erred here someone should step up and correct me pretty quickly.
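
A hedged sketch of that fill, flip, process, repeat cycle, assuming the goal is still to collect the whole text in a String. CharBufferDrain, readAll and the bufferChars parameter are illustrative names only. One deliberate difference from the description above: the sketch calls clear() rather than rewind() before the next fill, because rewind() leaves the limit where flip() set it, while clear() restores the limit to the full capacity.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.CharBuffer;

class CharBufferDrain {
    // Reads everything from the Reader through a reusable CharBuffer.
    static String readAll(Reader reader, int bufferChars) throws IOException {
        if (bufferChars <= 0) {
            throw new IllegalArgumentException("buffer must hold at least one char");
        }
        CharBuffer buf = CharBuffer.allocate(bufferChars);
        StringBuilder sb = new StringBuilder();
        while (reader.read(buf) != -1) { // fill: may add fewer chars than fit
            buf.flip();                  // switch from filling to draining
            sb.append(buf);              // copy the chars between position and limit
            buf.clear();                 // reset position and limit for the next fill
        }
        return sb.toString();
    }
}
```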

<http://java.sun.com/developer/technicalArticles/releases/nio/index.html>
<http://www.javaworld.com/javaworld/jw-09-2001/jw-0907-merlin.html>

GIYF.

Signature

Lew

Jeff Higgins - 25 May 2007 15:06 GMT
> Tom Hawtin wrote:
>>>>       FileReader fileReader = new FileReader(file);
[quoted text clipped - 4 lines]
>
> Because Strings and Chars are encoded, as are files. ...

OK, chars are not bytes. (int)file.length() not a good choice here.

>>> or way too small for a huge file.

if file.length() > Integer.MAX_VALUE file == huge file
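
A small sketch of that huge-file condition, showing what the plain (int) cast does to a length above Integer.MAX_VALUE and one way to refuse such a file up front. BufferSizing and checkedSize are hypothetical names, and the 3 GB figure is just an example value.

```java
class BufferSizing {
    // Converts a file length to an array or buffer size, refusing oversized
    // files instead of silently truncating the way a plain (int) cast does.
    static int checkedSize(long length) {
        if (length < 0 || length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                    "cannot hold " + length + " bytes in a single array");
        }
        return (int) length;
    }

    public static void main(String[] args) {
        long huge = 3000000000L;                   // larger than Integer.MAX_VALUE
        System.out.println((int) huge);            // prints -1294967296
        System.out.println(checkedSize(9612544L)); // prints 9612544
    }
}
```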

>>>>       fileReader.read(charBuffer);
>>> This does not necessarily read all that could be read. Should be in a
[quoted text clipped - 10 lines]
> java.nio much but if I erred here someone should step up and correct me
> pretty quickly.

Going back over Javadocs -- silly condition.

> <http://java.sun.com/developer/technicalArticles/releases/nio/index.html>
> <http://www.javaworld.com/javaworld/jw-09-2001/jw-0907-merlin.html>

Thanks for the pointers. I read the javaworld article, very interesting.

> GIYF.

GIGR: Google is a great resource.

Back to the OP which caught my eye, and to Tom's response,
"One byte at a time. Not going to be fast."

OK, scratch the CharBuffer solution. Now my latest solution:
[snippet]

startBlock = System.currentTimeMillis();
for(int i = 0; i < 10; i++)
 {
   File file = new File("file.9612544.bytes");
   byte[] a = new byte[(int)file.length()];
   FileInputStream fis = new FileInputStream(file);
   fis.read(a);
   String str = new String(a,"US-ASCII");
   fis.close();
 }
endBlock = System.currentTimeMillis();
startLoop = System.currentTimeMillis();
for(int i = 0; i < 10; i++)
 {
   File file = new File("file.9612544.bytes");
   byte[] a = new byte[(int)file.length()];
   FileInputStream fis = new FileInputStream(file);
   int n;
   int c = 0;
   while ((n = fis.read()) != -1)
   {
     a[0] = (byte)n;
   }
   String str = new String(a,"US-ASCII");
   fis.close();
 }
endLoop = System.currentTimeMillis();

Block 1547
Loop  287750

Thanks,
appreciate the OP
and all the comments.
Jeff Higgins
Knute Johnson - 27 May 2007 03:33 GMT
>> Tom Hawtin wrote:
>>>>>       FileReader fileReader = new FileReader(file);
[quoted text clipped - 45 lines]
>     FileInputStream fis = new FileInputStream(file);
>     fis.read(a);

This may or may not read as many bytes as the length of the array a and
is therefore guaranteed not to work every time.  See the docs.
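
One possible fix, sketched under the assumption that the array has already been sized to the number of bytes expected: keep calling read with an advancing offset until the array is full. ReadFully and readFully are names chosen here for illustration; the standard library offers the same behaviour through DataInputStream.readFully.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFully {
    // Keeps calling read until the array is full; a single read(a) may stop early.
    static void readFully(InputStream in, byte[] a) throws IOException {
        int off = 0;
        while (off < a.length) {
            int n = in.read(a, off, a.length - off);
            if (n == -1) {
                throw new EOFException(
                        "stream ended after " + off + " of " + a.length + " bytes");
            }
            off += n; // advance past the bytes just delivered
        }
    }
}
```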

>     String str = new String(a,"US-ASCII");
>     fis.close();
[quoted text clipped - 11 lines]
>     {
>       a[0] = (byte)n;

a[c++] = (byte)n;

>     }
>     String str = new String(a,"US-ASCII");
[quoted text clipped - 9 lines]
> and all the comments.
> Jeff Higgins

Signature

Knute Johnson
email s/nospam/knute/

Arne Vajhøj - 27 May 2007 03:50 GMT
>>     File file = new File("file.9612544.bytes");
>>     byte[] a = new byte[(int)file.length()];
[quoted text clipped - 3 lines]
> This may or may not read as many bytes as the length of the array a and
> is therefore guaranteed not to work every time.  See the docs.

s/guaranteed not/not guaranteed/w

Arne
Jeff Higgins - 04 Jul 2007 04:48 GMT
>>> Again, I'm sorry but I haven't been able to figure out what might
>>> cause read(charBuffer) to not read all that could be read?
[quoted text clipped - 15 lines]
>>
>> GIYF.

<http://mindprod.com:80/jgloss/readeverything.html>
Esmond Pitt - 04 Jul 2007 05:54 GMT
> Again, I'm sorry but I haven't been able to figure out what might
> cause read(charBuffer) to not read all that could be read?

The fact that the Javadoc specifically says so?
Jeff Higgins - 04 Jul 2007 14:14 GMT
>> Again, I'm sorry but I haven't been able to figure out what might
>> cause read(charBuffer) to not read all that could be read?
>
> The fact that the Javadoc specifically says so?

:-)  Yup, it is what it is.
Better for me to focus on what rather than why.
Patricia Shanahan - 04 Jul 2007 16:54 GMT
>>> Again, I'm sorry but I haven't been able to figure out what might
>>> cause read(charBuffer) to not read all that could be read?
>> The fact that the Javadoc specifically says so?
>
> :-)  Yup, it is what it is.
> Better for me to focus on what rather than why.

I think the "why" is because part of the file may be buffered in memory.
Disk reads are always in fixed block sizes, and the data required to
fill the program buffer may cross block boundaries.

Suppose some, but not all, of the data for the read call is already in
memory. The system could make you wait many milliseconds for a physical
read to let it fill your buffer. It is often more efficient to let you
get on with processing the data that is already available, in parallel
with a physical read to get more data. For example, the read call may be
being issued by a BufferedReader doing a readLine, and it can return
data to its caller as soon as it has a whole line.
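
The effect can be imitated without a disk. The sketch below wraps a stream so that each read call hands out at most a few bytes, standing in for a source that only has part of the data ready. TrickleInputStream and ShortReadDemo are hypothetical classes written for this illustration; they do not show how FileInputStream works internally.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that delivers at most `chunk` bytes per read call.
class TrickleInputStream extends FilterInputStream {
    private final int chunk;

    TrickleInputStream(InputStream in, int chunk) {
        super(in);
        this.chunk = chunk;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // Ask the underlying stream for no more than `chunk` bytes.
        return super.read(b, off, Math.min(len, chunk));
    }
}

class ShortReadDemo {
    // Bytes delivered by one read(buf) call on a 10-byte source trickling 4 at a time.
    static int firstReadSize() throws IOException {
        InputStream in = new TrickleInputStream(
                new ByteArrayInputStream(new byte[10]), 4);
        byte[] buf = new byte[10];
        return in.read(buf); // returns 4, although 10 bytes exist
    }
}
```

A caller that assumes the first read filled the buffer would process six bytes of zeros it never received, which is why the read belongs in a loop.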

Patricia
Roedy Green - 04 Jul 2007 19:38 GMT
>For example, the read call may be
>being issued by a BufferedReader doing a readLine, and it can return
>data to its caller as soon as it has a whole line.

Even though we did double buffering and the like back in the days of
16K machines, I don't think java.io itself is that smart. I don't
think it is clever enough to read ahead another buffer while processing
the previous one, or to let you start processing lines before the
I/O completes.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
Jeff Higgins - 04 Jul 2007 20:45 GMT
>>>> Again, I'm sorry but I haven't been able to figure out what might
>>>> cause read(charBuffer) to not read all that could be read?
[quoted text clipped - 14 lines]
> being issued by a BufferedReader doing a readLine, and it can return
> data to its caller as soon as it has a whole line.

LOL :-) What us noobs won't go through to gain a little understanding!
Yup, during the course of this discussion I spent a good bit of energy
exploring some of the issues you describe. Mostly what I took away from it
was:
When using the basic IO facilities I should be concentrating on what I'm
hoping to accomplish, and not on how the JVM is fetching bytes from whatever
physical medium.
What caused most of my confusion, I suppose, was the fact that I didn't have
a real use case in mind for this exploration. The OP wanted to know how to
read the contents of a file into a String, and I immediately reacted by
trying to find a solution to that problem when I may well have been better
off asking "What am I hoping to accomplish here?". When given the advice,
"This does not necessarily read all that could be read. Should be in a
loop.", and after having consulted the Javadocs, my next question should
probably have been "OK, now what?" instead of "Well, why not?".
Anyway, it's been a pleasant line of inquiry, and fun.
Thanks for the response, much appreciated.
JH
Benjamin - 25 May 2007 00:15 GMT
On May 23, 9:56 pm, Knute Johnson <nos...@rabbitbrush.frazmtn.com>
wrote:
> > What's the best way to get the contents of a file represented by a
> > java.io.File object into a String?
>
> You don't specify what you consider best so how about simple as best?
> Now for my curiosity, why would you want to do this?
You're right, I should be more specific. I did mean simplest. I am
writing a program which requires me to retrieve the full content of a
file.

> import java.io.*;
>
[quoted text clipped - 17 lines]
> Knute Johnson
> email s/nospam/knute/

