i would like to use NIO to improve performance processing some very
large text files -- 1 to 4 GB. i have written my processor in
standard i/o and it's impractically slow.
the processing is line-oriented and not complicated, which is why i
believe that i/o is the performance bottleneck.
i just cannot 'get' the design process for using NIO, based on online
examples and reading the API. i could use some help.
i've figured out how to memory map a buffer from the input file, turn
it into a CharBuffer, read lines from it and do a simple test parse of
the lines (counting the occurrence of a regex match). i got most of
that from an example on the sun site.
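For readers following along, the approach described above might look roughly like the sketch below. It is not the poster's actual code: the class and method names are hypothetical, ISO-8859-1 is an assumed encoding, and it uses Java 7+ APIs (FileChannel.open, try-with-resources). Note that a single mapping cannot exceed Integer.MAX_VALUE bytes, and decode() copies the data into a heap CharBuffer, which is one way to run out of heap on large files.

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a file, decodes it to chars, counts regex matches.
final class MappedMatchCounter {

    static long countMatches(Path file, Pattern pattern) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // One mapping is limited to Integer.MAX_VALUE bytes (about 2 GB);
            // larger files would have to be mapped in several pieces.
            MappedByteBuffer bytes =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // Assumption: single-byte encoding. decode() allocates a heap
            // CharBuffer holding the whole decoded content.
            CharBuffer chars = StandardCharsets.ISO_8859_1.decode(bytes);

            Matcher matcher = pattern.matcher(chars);
            long count = 0;
            while (matcher.find()) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: MappedMatchCounter <file> <regex>");
            return;
        }
        // MULTILINE lets ^ and $ match at line boundaries inside the buffer.
        Pattern pattern = Pattern.compile(args[1], Pattern.MULTILINE);
        System.out.println(countMatches(Paths.get(args[0]), pattern));
    }
}
```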
but, how do i write out the results? i have managed to create an
output buffer from a RandomAccessFile object, but i just get a file of
empty bytes, the size of the buffer. somehow, i need to write chars
to that file, and i need to end up with a plain text file, the size of
which matches the amount of data written to it, not the size of the
memory buffer.
a further question is, how do i 'slide the window' along a multi-GB
file, reading in small pieces at a time for processing? and, finally,
how do i determine an optimal buffer size? is bigger better? on my
laptop, i have to fiddle the VM heap size and read buffer size to keep
from exhausting the heap. i'm not sure what is the best approach.
many thanks for help. i have spent a lot of hours this weekend just
to make this little progress.
thanks.
mp

Michael Powe michael@trollope.org Naugatuck CT USA
Is it time for your medication or mine?
Remon van Vliet - 10 Apr 2006 11:26 GMT
>i would like to use NIO to improve performance processing some very
> large text files -- 1 to 4 GB. i have written my processor in
[quoted text clipped - 30 lines]
>
> mp
It's unlikely NIO will solve anything for you. Among other things, NIO adds
non-blocking IO, which helps an application use fewer threads for a large
number of concurrent IO channels. Since you said you only work with one large
file, there will be no performance increase from non-blocking IO. I think
it's highly likely you're doing something else less efficiently than strictly
necessary. How about you post your read code here?
Robert Klemme - 10 Apr 2006 12:01 GMT
> i would like to use NIO to improve performance processing some very
> large text files -- 1 to 4 GB. i have written my processor in
> standard i/o and it's impractically slow.
>
> the processing is line-oriented and not complicated, which is why i
> believe that i/o is the performance bottleneck.
Please profile your app to make sure that your belief and reality
actually match. Also, a good test is to just race through the file
reading bytes to determine how fast you can expect to go on your
hardware / OS.
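A minimal sketch of such a baseline test, assuming Java 7 or later; the class name and the 64 KB buffer size are arbitrary choices for illustration, not recommendations. It reads the file in chunks and does nothing with the bytes, so the elapsed time approximates the pure I/O cost.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical baseline: read every byte, process nothing, report throughput.
final class RawReadBaseline {

    // Reads the whole file in bufferSize chunks and returns the bytes read.
    static long countBytes(Path file, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: RawReadBaseline <file>");
            return;
        }
        long start = System.nanoTime();
        long bytes = countBytes(Paths.get(args[0]), 64 * 1024);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d bytes in %.2f s (%.1f MB/s)%n",
                bytes, seconds, bytes / 1e6 / seconds);
    }
}
```

A second run over the same file may be served largely from the OS file cache, so the first and later runs can differ a lot.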
Btw, from my experience (i.e. measurements) it's usually fastest to
use a BufferedReader's readLine() when doing line-oriented stuff. YMMV
though.
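A rough sketch of what that could look like for a line-oriented filter, assuming Java 7 or later and UTF-8 input; the class name, method name and "copy matching lines" task are made up for illustration. Because the output goes through a BufferedWriter, the resulting file is exactly as large as the text written to it.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter: copies lines that match a regex to an output file.
final class LineFilter {

    // Returns the number of lines that contained a match and were written.
    static long copyMatchingLines(Path in, Path out, Pattern pattern)
            throws IOException {
        long matched = 0;
        // Assumption: input is valid UTF-8; malformed bytes raise an exception.
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            Matcher matcher = pattern.matcher(""); // reused for every line
            String line;
            while ((line = reader.readLine()) != null) {
                if (matcher.reset(line).find()) {
                    writer.write(line);
                    writer.newLine(); // platform line separator
                    matched++;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: LineFilter <in> <out> <regex>");
            return;
        }
        long n = copyMatchingLines(Paths.get(args[0]), Paths.get(args[1]),
                Pattern.compile(args[2]));
        System.out.println(n + " matching lines written");
    }
}
```

Only one line is held in memory at a time, so heap use stays flat regardless of file size.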
Kind regards
robert