Java Forum / General / January 2008

Questions about buffered streams

failure_to@yahoo.co.uk - 07 Jan 2008 01:19 GMT
hello

Sorry for so many questions, but I think I/O is one topic that will
give me troubles for quite some time

1)
What makes read and write operations time consuming ( or at least more
time consuming than calls to ordinary, non IO methods? )?

a) The fact that underlying stream classes make calls to system
libraries?

b)
* Buffer classes are supposed to help us with that, since they buffer
data and thus don’t necessarily call the underlying system for each byte.
But nonetheless, even if a buffered stream doesn’t immediately call the
underlying system in order to write a byte, it still has to call the
underlying system ( say it buffers 5 bytes of data and at some point
we call flush() ) once for each byte, just as non-buffered stream
classes have to… so in the end, the same amount of time was spent writing
those five bytes to a file as if we’d written those 5 bytes with a
non-buffered stream … the only difference being that those 5 bytes were
written at once and not every time write() was called?!

* Or, can buffered streams somehow call underlying system’s method
just once and with just that one call write all five bytes of data?

c)
Even if buffered byte streams can somehow write all 5 bytes with one
system call, that shouldn’t be true for  BufferedWriter streams?!
BufferedWriter stream  doesn’t directly talk to underlying byte
stream, so I assume it would still take 5 system calls to write those
five bytes of data? So no time was saved!

2)
FileInputStream     FS = new FileInputStream ( "A.txt" );
BufferedInputStream BS = new BufferedInputStream ( FS );
DataInputStream     DS = new DataInputStream ( BS );

Of the three objects above, I assume only the byte stream objects keep
some sort of internal pointer which keeps track of which bytes in a
stream were already written/read and thus advances this pointer with
each read or write operation? Wrapper objects ( BS and DS in the above
example ) don’t have such internal pointers?!

3)
If you use, say, the FileOutputStream method write( byte buf[]… ), does it
act like a kind of buffer and write all those bytes with one system call,
or does it make one system call for each byte written?

4)
BufferedReader in = new
             BufferedReader( new FileReader("foo.in") );

Does even a simple in.read() without any parameters specified cause the
wrapped FileReader object to read more than just one character from the
underlying byte stream?

5)
Next questions are about PrintStream class. Here is what Java docs and
my book have to say about this class:

“All characters printed by a PrintStream are converted into bytes
using the platform's default character encoding. “

I assume the text is referring to cases where we don’t specify type of
encoding in a constructor, since if we do specify which encoding to
use, then PrintStream converts characters into bytes using specified
encoding and not platform’s default character encoding?!

“For real-world programs, the recommended method of
writing to the console when using Java is through a PrintWriter
stream. PrintWriter is one of the character-based classes. Using a
character-based class for console output makes it easier to
internationalize your program.”

“The PrintWriter class should be used in situations that require
writing characters rather than bytes.”

* Why should PrintWriter be used instead in situations that require
writing characters instead of bytes?

* How does PrintWriter make it easier to internationalize a program?

* When dealing with characters, when and why ( or why not ) would you
choose PrintWriter over some other character based stream ( like
OutputStreamWriter )?

thank you

cheers
Silvio Bierman - 07 Jan 2008 02:09 GMT
> hello
>
[quoted text clipped - 87 lines]
>
> cheers

You asked too many questions to be answered individually, especially
because they stack on top of each other. I will try to give a brief
explanation and suggest you do some reading/googling.

The system calls you refer to primarily come down to the same two system
calls that can read/write n bytes from/to what is often called a file
descriptor. In C these system calls would be

ssize_t read(int fd, void *buf, size_t nbytes);
ssize_t write(int fd, const void *buf, size_t nbytes);

For any Java implementation the basic systems routines might look
entirely different but this should be sufficiently accurate.

This answers your question about buffered streams: they read/write blocks
of a more optimal size in one system call, rather than issuing one system
call for each individual read/write performed on the stream.
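
As a rough illustration of that point, the sketch below counts how many read requests reach the wrapped stream when a program reads one byte at a time. CountingInputStream and ReadCallCounter are hypothetical names made up for this example, and an in-memory ByteArrayInputStream stands in for a file, so the counter only approximates what would be system calls on a real FileInputStream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadCallCounter {
    /** Hypothetical helper: counts each request that reaches the wrapped source. */
    static final class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads 'count' single bytes and returns how many requests reached the source. */
    static int requestsForSingleByteReads(int count, boolean buffered) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[4096]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        try {
            for (int i = 0; i < count; i++) {
                in.read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + requestsForSingleByteReads(1000, false));
        System.out.println("buffered:   " + requestsForSingleByteReads(1000, true));
    }
}
```

With a real file in place of the byte array, each counted request on the unbuffered path would correspond to a trip into native code.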

Readers/writers add the abstraction of characters and encodings of
characters into bytes. At the end they need a stream to read/write bytes.

Basically this is all you need but if you want to be able to write
strings, integers etc. to some character oriented output then you will
need some formatting logic and that is what a PrintWriter will do for you.

A DataInputStream or a DataOutputStream is something that adds binary IO
of Strings, integers etc. to a byte oriented stream.

This all fits together nicely. In combination with the java.text.XXXFormat
classes you have a rather complete set of basic IO tools in the Java SDK.

I would suggest a good Java textbook or the Sun website to learn more.

Good luck,

Silvio Bierman
Roedy Green - 07 Jan 2008 05:28 GMT
>What makes read and write operations time consuming ( or at least more
>time consuming than calls to ordinary, non IO methods? )?

they require mechanical motion of disk head, and waiting for disk
surfaces to spin under the read head, and for the data to mechanically
pass by the read head.
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Roedy Green - 07 Jan 2008 06:12 GMT
>b)
>* Buffer classes are supposed to help us with that, since they buffer
[quoted text clipped - 7 lines]
>buffered stream … only difference being that those 5 bytes were
>written at once and not every time write() was called?!

If you wrote 1 byte at a time, you would have to wait for the spot on
disk to spin round for each byte. If you write 64,000 bytes at a time,
you only have to wait once for the proper spot on disk to spin round.

see http://mindprod.com/jgloss/buffer.html

Roedy Green - 07 Jan 2008 06:13 GMT
>Even if buffered byte streams can somehow write all 5 bytes with one
>system call, that shouldn’t be true for  BufferedWriter streams?!
>BufferedWriter stream  doesn’t directly talk to underlying byte
>stream, so I assume it would still take 5 system calls to write those
>five bytes of data? So no time was saved!

when you write to a buffer, the buffer class only writes to the system
when the buffer is full, or when you flush or close.
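
To check this for the character-stream case asked about in c), here is a minimal sketch that writes five characters through a BufferedWriter wrapped around an OutputStreamWriter and counts what reaches the byte stream underneath. CountingOutputStream and WriterFlushDemo are hypothetical names for this example; the counter stands in for what would be system calls on a real FileOutputStream.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class WriterFlushDemo {
    /** Hypothetical sink: counts write requests and bytes instead of storing them. */
    static final class CountingOutputStream extends OutputStream {
        int calls;
        int bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /** Returns {write calls before flush, write calls after flush, bytes after flush}. */
    static int[] writeFiveChars() {
        CountingOutputStream sink = new CountingOutputStream();
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
            for (char c : "hello".toCharArray()) {
                out.write(c);              // five separate one-character writes
            }
            int before = sink.calls;       // nothing should have reached the sink yet
            out.flush();                   // pushes the buffered characters down the chain
            return new int[] { before, sink.calls, sink.bytes };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = writeFiveChars();
        System.out.println("calls before flush: " + r[0]);
        System.out.println("calls after flush:  " + r[1] + ", bytes: " + r[2]);
    }
}
```

The encoder inside OutputStreamWriter keeps its own byte buffer, so the five characters are converted and handed on as a block, not one call per character.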

Roedy Green - 07 Jan 2008 06:18 GMT
>FileInputStream       FS = new FileInputStream ( "A.txt" );
>BufferedInputStream BS = new BufferedInputStream ( FS );
[quoted text clipped - 5 lines]
>each read or write operation? Wrapper objects ( BS and DS in the above
>example ) don’t have such internal pointers?!

All streams keep track internally of how many bytes have been read
both on disk and from the buffer.  RandomAccessFiles also keep track
of where you are in the file, but explicitly with getFilePointer and
seek.
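
For the RandomAccessFile part, a small sketch of getFilePointer and seek follows. The class name FilePointerDemo and the temporary file are assumptions made for the example; it shows only what the API reports, not where the position is actually stored.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FilePointerDemo {
    /** Returns {pointer after reading 4 bytes, pointer after seek(2), next byte read}. */
    static long[] pointerPositions() {
        try {
            Path tmp = Files.createTempFile("pointer-demo", ".bin");
            try {
                Files.write(tmp, new byte[] {10, 20, 30, 40, 50, 60, 70, 80});
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
                    raf.readFully(new byte[4]);           // consume the first four bytes
                    long afterRead = raf.getFilePointer();
                    raf.seek(2);                          // jump back to offset 2
                    long afterSeek = raf.getFilePointer();
                    int next = raf.read();                // byte stored at offset 2
                    return new long[] { afterRead, afterSeek, next };
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] p = pointerPositions();
        System.out.println("after read: " + p[0] + ", after seek: " + p[1] + ", next byte: " + p[2]);
    }
}
```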

EJP - 08 Jan 2008 03:27 GMT
> All streams keep track internally of how many bytes have been read
> both on disk and from the buffer.

No they don't.

> RandomAccessFiles also keep track of where you are in the file,
> but explicitly with getFilePointer and seek.

No they don't. The operating system does that.
failure_to@yahoo.co.uk - 08 Jan 2008 17:23 GMT
Roedy Green - 09 Jan 2008 06:55 GMT
On Tue, 08 Jan 2008 03:27:29 GMT, EJP
<esmond.not.pitt@not.bigpond.com> wrote, quoted or indirectly quoted
someone who said :

>> All streams keep track internally of how many bytes have been read
>> both on disk and from the buffer.
>
>No they don't.

Check out the code for BufferedReader.skip.  An IDE like IntelliJ Idea
will take you to the source from any reference with Ctrl-B.  It HAS to
track where it is in the buffer because the OS knows nothing about the
buffer.

Check out the code for FileInputStream.read (the basic read routine).
It is a native method.  It might or might not maintain a mirror copy
of the physical cursor position.  I don't see how you can be so
certain.  It could be different on different platforms.  In any case,
logically read does track the physical cursor location.  Consider that
Java could be implemented on a file system without sequential files,
just random access. This would be disguised in the native methods.


Roedy Green - 07 Jan 2008 06:20 GMT
>3)
>If you use say Fileoutputstream method write( int buf[]…), does it act
>like kind of buffer and reads all those bytes with one system call, or
>does it make one system call for each byte read?

You can examine the source for yourself in src.zip.  It will write all
the bytes in one go.  I do a lot of I/O megabytes at a pop and it is
very fast, certainly not a byte at a time.
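
A minimal sketch of that bulk form, assuming a temporary file as the target (BulkWriteDemo is a made-up name): the whole array is handed to FileOutputStream.write(byte[]) in one call. How the native code and the OS break the request up further is not visible from Java and is not shown here.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class BulkWriteDemo {
    /** Writes 'size' bytes with a single write(byte[]) call and returns the file size. */
    static long writeInOneCall(int size) {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 'x');
        try {
            Path tmp = Files.createTempFile("bulk-write", ".bin");
            try {
                try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
                    out.write(data);   // one call for the whole array, no loop over bytes
                }
                return Files.size(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes on disk: " + writeInOneCall(1 << 20));
    }
}
```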

Roedy Green - 07 Jan 2008 06:21 GMT
>4)
> BufferedReader in = new
[quoted text clipped - 3 lines]
>wrapped FileReader object to read more than just one character from
>underlying byte stream?

Buffered readers will either:

1. satisfy the request from the buffer.
2. read a buffer full, then satisfy the request.
3. read to the tail end of the file if it can't get a whole buffer
full, then satisfy the request.
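
The three cases above can be observed with a small sketch. RecordingReader and BufferedReaderFillDemo are hypothetical names, and a StringReader stands in for the FileReader from the question; the wrapper records how many requests BufferedReader makes and how many characters it asks for.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BufferedReaderFillDemo {
    /** Hypothetical wrapper: records the requests made by the reader above it. */
    static final class RecordingReader extends FilterReader {
        int calls;
        int largestRequest;

        RecordingReader(Reader in) {
            super(in);
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            calls++;
            largestRequest = Math.max(largestRequest, len);
            return super.read(buf, off, len);
        }
    }

    /** Reads three single characters; returns {requests made, largest request size}. */
    static int[] readThreeChars() {
        RecordingReader source = new RecordingReader(new StringReader("0123456789"));
        try (BufferedReader in = new BufferedReader(source)) {
            in.read();   // triggers one fill of the buffer
            in.read();   // served from the buffer
            in.read();   // served from the buffer
            return new int[] { source.calls, source.largestRequest };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = readThreeChars();
        System.out.println("requests: " + r[0] + ", largest request: " + r[1] + " chars");
    }
}
```

So even a plain in.read() asks the wrapped reader for a whole buffer's worth of characters, and the source here simply returns the ten it has.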


Roedy Green - 07 Jan 2008 06:23 GMT
>“All characters printed by a PrintStream are converted into bytes
>using the platform's default character encoding. ”

Inside the program you are using 16-bit Unicode.  Your platform
typically supports 8-bit text files.  Which encoding depends on where
you live.  See http://mindprod.com/jgloss/encoding.html

PrintStream automatically encodes to the local 8-bit encoding.
However, you can explicitly choose the encoding, e.g. UTF-8 or even
UTF-16.
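
A short sketch of choosing the encoding explicitly, using the PrintStream constructor that takes an encoding name. PrintStreamEncodingDemo is a made-up class name, and a ByteArrayOutputStream is used in place of a file so the encoded length can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class PrintStreamEncodingDemo {
    /** Prints 'text' through a PrintStream using 'encoding'; returns the byte count. */
    static int encodedLength(String text, String encoding) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, encoding)) {
            out.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown encoding: " + encoding, e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";   // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("UTF-8:      " + encodedLength(eAcute, "UTF-8") + " bytes");
        System.out.println("ISO-8859-1: " + encodedLength(eAcute, "ISO-8859-1") + " bytes");
    }
}
```

The same character comes out as a different number of bytes depending on the encoding, which is why relying on the platform default makes files less portable.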

Roedy Green - 07 Jan 2008 06:24 GMT
>“The PrintWriter class should be used in situations that require
>writing characters rather than bytes.”

PrintWriters are for writing text files. They have translation going
on. This would confound efforts to compose binary bytes. For that, use
DataOutputStream.
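
The difference can be seen by writing the same int both ways. This is only a sketch with made-up names (TextVersusBinaryDemo, asText, asBinary), writing into memory rather than to a file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class TextVersusBinaryDemo {
    /** Formats the value as decimal text, then encodes the characters to bytes. */
    static byte[] asText(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(sink, StandardCharsets.US_ASCII))) {
            out.print(value);
        }
        return sink.toByteArray();
    }

    /** Writes the value as four raw bytes, high byte first. */
    static byte[] asBinary(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(asText(1234)));
        System.out.println(java.util.Arrays.toString(asBinary(1234)));
    }
}
```

PrintWriter produces the characters '1', '2', '3', '4'; DataOutputStream produces the four bytes of the int itself, which only DataInputStream (or equivalent code) can read back.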

Roedy Green - 07 Jan 2008 06:25 GMT
>* How does PrintWriter make it easier to internationalize a program?

If you have an explicit encoding, you can put that in your
internationalisation configuration file.

see http://mindprod.com/jgloss/internationalisation.html

Roedy Green - 07 Jan 2008 06:26 GMT
>* When dealing with characters, when and why ( or why not ) would you
>choose PrintWriter over some other character based stream ( like
>OutputStreamWriter )?

PrintWriter gives you extra methods, mostly println which will insert
a platform specific line ending.
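
For comparison, a minimal sketch of the two usual ways to get a platform line ending on a character stream: PrintWriter.println and BufferedWriter.newLine. LineEndingDemo and its method names are made up for the example, and a StringWriter is used as the target so the result can be inspected.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class LineEndingDemo {
    /** One call: println appends the platform line separator itself. */
    static String viaPrintWriter(String line) {
        StringWriter sink = new StringWriter();
        try (PrintWriter out = new PrintWriter(sink)) {
            out.println(line);
        }
        return sink.toString();
    }

    /** Two calls: write the text, then ask for the line separator explicitly. */
    static String viaBufferedWriter(String line) {
        StringWriter sink = new StringWriter();
        try (BufferedWriter out = new BufferedWriter(sink)) {
            out.write(line);
            out.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPrintWriter("abc").equals(viaBufferedWriter("abc")));
    }
}
```

Note also that PrintWriter methods do not throw IOException (errors are reported through checkError()), which is the other practical difference from the plain Writer classes.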

failure_to@yahoo.co.uk - 07 Jan 2008 18:45 GMT
hello

On Jan 7, 7:12 am, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:
> On Sun, 6 Jan 2008 17:19:16 -0800 (PST), failure...@yahoo.co.uk
> wrote, quoted or indirectly quoted someone who said :
[quoted text clipped - 17 lines]
> bytes at a time, you only have to wait once for the proper spot
> on disk to spin round.

a) So in essence, when flush() is called, buffered stream calls
underlying byte stream's write() and this write() method calls
underlying system just once and with that one call transfers all of
64000 bytes( meaning, write() is not called 64000 times )?

b)
FileOutputStream fo = new FileOutputStream( "A.txt" );
fo.write(byte_1);
fo.write(byte_2);
.
.
.
fo.write(byte_64000);

So in theory, the above code would write those 64000 bytes to the system
in approx. the same amount of time as a buffered stream would, assuming
no other thread blocks this output stream?
I'm assuming this since:

* first write() call ( fo.write(byte_1) ) causes disk to spin to
appropriate spot
* since after the first write() call disk is at the appropriate spot,
the disk doesn't have to rotate for the subsequent 63999 write()
calls

> >Even if buffered byte streams can somehow write all 5 bytes
> >with one system call, that shouldn’t be true for
[quoted text clipped - 6 lines]
> system when the buffer is full, or when you flush or close.
> --

So when BufferedWriter flushes its data, the procedure is the same as
when BufferedOutputStream flushes its data ( meaning it takes same
amount of time to write those 64000 bytes to the file )?

> >How does PrintWriter make it easier to internationalize a
> >program?
>If you have an explicit encoding, you can put that in your
> internationalisation configuration file.

So the only advantage of PrintWriter over PrintStream ( when dealing with
characters ) is internationalization?

> >When dealing with characters, when and why ( or why not )
> >would you choose PrintWriter over some other character based
> >stream ( like OutputStreamWriter )?
>PrintWriter gives you extra methods, mostly println which will
>insert a platform specific line ending.

* While other output character streams only insert a platform specific
line ending when newLine() is called?

* Don't character streams also have a method which writes a full line
and that automatically adds native newline sequence? I'm asking this
cos I can't find one.

> > FileInputStream     FS = new FileInputStream ( "A.txt" );
> > BufferedInputStream BS = new BufferedInputStream ( FS );
[quoted text clipped - 10 lines]
>keep track of where you are in the file, but explicitly with
>getFilePointer and seek.

Yes, but only the FileInputStream knows the TOTAL offset from the
beginning of the file ( from the time we first started reading the
file )?!

thank you
Roedy Green - 07 Jan 2008 19:49 GMT
>a) So in essence, when flush() is called, buffered stream calls
>underlying byte stream's write() and this write() method calls
>underlying system just once and with that one call transfers all of
>64000 bytes( meaning, write() is not called 64000 times )?

If you had a buffer of 64K, no matter how small the pieces you wrote,
no physical I/O would happen until you called flush or close if the
total size were under 64K.  If you wrote more than 64K, you would get
a physical write when you filled the first 64K.
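
A sketch of exactly that behaviour, assuming a hypothetical CountingOutputStream in place of a real file so the "physical" writes can be counted (SixtyFourKBufferDemo is also a made-up name):

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class SixtyFourKBufferDemo {
    /** Hypothetical sink: counts write requests instead of storing bytes. */
    static final class CountingOutputStream extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** Returns the sink's call count after small writes, after flush, after overflow. */
    static int[] underlyingWrites() {
        final int bufferSize = 64 * 1024;
        CountingOutputStream sink = new CountingOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < 1000; i++) {
                out.write(i);                  // stays in the buffer
            }
            int afterSmallWrites = sink.calls;
            out.flush();                       // the 1000 bytes go down in one write
            int afterFlush = sink.calls;
            for (int i = 0; i < bufferSize + 1; i++) {
                out.write(i);                  // the last byte finds the buffer full
            }
            int afterOverflow = sink.calls;
            return new int[] { afterSmallWrites, afterFlush, afterOverflow };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = underlyingWrites();
        System.out.println(r[0] + " / " + r[1] + " / " + r[2]);
    }
}
```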

Did you read my essay at http://mindprod.com/jgloss/buffer.html

If you did, please read it again and tell me where you got confused.

Roedy Green - 07 Jan 2008 19:50 GMT
>So in theory, above code would write those 64000 bytes to a system in
>approx. the same amount of time as Buffered stream would, assuming no
>other thread blocks this output stream?
>I'm assuming this since:

Yes. However, there is still some overhead for each call to write, to
copy the bytes to the buffer.  It has to check if the buffer is full
etc.

Daniel Pitts - 07 Jan 2008 20:00 GMT
On Jan 7, 11:50 am, Roedy Green <see_webs...@mindprod.com.invalid>
wrote:
> On Mon, 7 Jan 2008 10:45:10 -0800 (PST), failure...@yahoo.co.uk wrote,
> quoted or indirectly quoted someone who said :
[quoted text clipped - 7 lines]
> copy the bytes to the buffer.  It has to check if the buffer is full
> etc.
Actually, the disk is constantly spinning, so if you don't write in a
complete block, the disk may pass the position you want it to write to
before you write, so it would in effect be in the *worst* position for
the write.

Not to mention that typically disk IO happens in Sectors or Clusters,
which are usually at least 512 bytes long.  Unless the OS itself does
some caching, writing one byte at a time is actually a read of 512
bytes, update of *that* buffer, and a write of 512 bytes.  As you can
imagine, this is highly inefficient.

Something else to note is that your discussion so far has assumed
disk IO operations, but there are other forms of IO, including network
IO.  Writing one byte at a time to a Socket stream can result in a lot
of overhead for the underlying protocols.  TCP/IP adds a minimum of
40 bytes of headers per packet (20 for IPv4 plus 20 for TCP), not to
mention the Ethernet and OS overhead.
failure_to@yahoo.co.uk - 07 Jan 2008 22:05 GMT
hello

> >a) So in essence, when flush() is called, buffered stream
> >calls underlying byte stream's write() and this write() method
[quoted text clipped - 6 lines]
>64K, you would get a physical write when you filled the first
>64K.

I realize that!

>Did you read my essay at http://mindprod.com/jgloss/buffer.html
>If you did, please read it again and tell me where you got
>confused.

I'm not sure how you got the impression that article  got me
confused?

> >So in theory, above code would write those 64000 bytes to a
> >system in approx. the same amount of time as Buffered stream
[quoted text clipped - 3 lines]
>to copy the bytes to the buffer.  It has to check if the buffer
>is full etc.

Are you talking about the code below or about buffered streams? I know
from your article that using buffers can cause some overhead due to
bytes being copied to the buffer, but from my understanding the fo object
doesn't buffer these bytes, but sends them directly to the system ... so
in theory ( well, my much-flawed theory ) the code below should write
those bytes to the system faster than a buffered stream would ( assuming
the disk isn't constantly spinning :) ... which apparently it is )

FileOutputStream fo = new FileOutputStream( "A.txt" );
fo.write(byte_1);
fo.write(byte_2);
.
.
.
fo.write(byte_64000);

> Actually, the disk is constantly spinning, so if you don't write
> in a complete block, the disk may pass the position you want it
[quoted text clipped - 7 lines]
> write of 512 bytes.  As you can imagine, this is highly
> inefficient.

but if the disk wasn't constantly spinning then

FileOutputStream fo = new FileOutputStream( "A.txt" );
fo.write(byte_1);
fo.write(byte_2);
.
.
.
fo.write(byte_64000);

would be just as efficient as if buffered stream flushed those 64000
bytes?
Lew - 08 Jan 2008 01:41 GMT
Roedy Green wrote:
>> Did you read my essay at http://mindprod.com/jgloss/buffer.html
>> If you did, please read it again and tell me where you got
>> confused.

> I'm not sure how you got the impression that article  got me
> confused?

Perhaps it was your assertion that 64K individual writes of one byte would
proceed faster than a single write of 64K bytes that gave that impression.

>>> So in theory, above code would write those 64000 bytes to a
>>> system in approx. the same amount of time as Buffered stream
>>> would, assuming no other thread blocks this output stream?

No.

64K one-byte writes will be *much* slower than one 64 KB write.

> Are you talking about the code below or about buffered streams? I know
> from your article that using buffers can cause some overhead due to
[quoted text clipped - 3 lines]
> those bytes to the system faster than buffered stream would ( assuming
> the disk isn't constantly spinning :) ... which apparently it is )

No.

Even assuming you're writing to a disk, your Java write operation is so far
removed from "spinning" that it isn't even remotely useful to think of it in
those terms.

You have Java flushing to a system buffer, which writes to a driver, which
loads data onto a disk-controller cache if there is one, which loads data onto
the disk's own cache, which loads data onto the disk platter(s).  Assuming no
RAID, which adds some overhead of multiple-disk synchronization.  OSes have
'fsync' and such modes that determine if writes go all the way to platter
before reporting completion, which may or may not be engaged.

Anyway, each individual write has to go through all those layers - 64000 times
for one byte apiece will always lose to 64KB through the gate in a single rush.
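
Anyone who wants to see this on their own machine can try a rough timing sketch like the one below. OneByteWriteTiming and its methods are made-up names, the target is a temporary file, and the numbers depend heavily on the OS, file system and JVM warm-up, so treat them as an indication only and not as a benchmark.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class OneByteWriteTiming {
    /** Writes 'count' single bytes to 'target' and returns the elapsed nanoseconds. */
    static long timeWrites(Path target, int count, boolean buffered) {
        long start = System.nanoTime();
        try (OutputStream out = buffered
                ? new BufferedOutputStream(new FileOutputStream(target.toFile()))
                : new FileOutputStream(target.toFile())) {
            for (int i = 0; i < count; i++) {
                out.write(i);   // one call per byte in both cases
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return System.nanoTime() - start;
    }

    /** Returns {unbuffered nanos, buffered nanos, unbuffered file size, buffered file size}. */
    static long[] compare(int count) {
        try {
            Path a = Files.createTempFile("unbuffered", ".bin");
            Path b = Files.createTempFile("buffered", ".bin");
            try {
                long slow = timeWrites(a, count, false);
                long fast = timeWrites(b, count, true);
                return new long[] { slow, fast, Files.size(a), Files.size(b) };
            } finally {
                Files.deleteIfExists(a);
                Files.deleteIfExists(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = compare(64000);
        System.out.println("unbuffered: " + r[0] / 1000 + " us");
        System.out.println("buffered:   " + r[1] / 1000 + " us");
    }
}
```

Both runs produce identical files; only the number of trips through the layers described above differs.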

Signature

Lew

Roedy Green - 09 Jan 2008 07:54 GMT
>Perhaps it was your assertion that 64K individual writes of one byte would
>proceed faster than a single write of 64K bytes that gave that impression.

Here is how I have rewritten the section.  I hope this makes
everything clear:

Because of hard disk latency, when you do I/O, it will go faster if
you do it in a few big physical I/O chunks rather than a number of
small ones. If you wrote data one byte at a time, you would have to
wait for the disk arms to snap to the correct cylinder, and for the
platter to rotate round to the correct spot every time you wrote a byte.
If you buffered at 64,000 characters, you would have to do this wait
only once every 64,000 characters. Mechanical motion is in the order
of 1000 times slower than electronics.

If you wrote a byte at a time, since the hardware works in 512-byte
sectors at a time, the OS would need to read the sector, plop your
byte into it and write the entire sector back. This would take at
least 2 disk rotations, perhaps 3. Even if you wrote your data 512
bytes at a time, when you went to write the next sector, its spot
would have just past the head, so you would have to wait an entire
rotation for its spot to come round. If you wrote 131,072 bytes (still
less than 1 physical track) at a pop, you could do that all in one
rotation.

Ideally, if you have enough RAM, you do the I/O in one whacking huge
file-sized unbuffered chunk. Java has a number of classes that let you
process a file buffered in convenient small logical chunks, often line
by line. The buffered classes transparently handle the physical I/O in
bigger chunks, typically 4096 bytes. The classes store each large
chunk for physical I/O in a separate piece of RAM called a buffer.
Unless the buffer size for the physical I/O is at least twice as big
as the size of the logical chunks you process, there is not much point
in buffering. The extra buffering copying overhead will just slow you
down.

The File I/O Amanuensis will teach you how to do I/O either buffered
or unbuffered. You can try it both ways, and see which works faster.
You can also experiment with buffer sizes. The bigger the buffer, the
fewer the physical I/Os you need to process the file. However, the
bigger the buffer, the more virtual RAM you will use, which may
trigger more swapping I/O. Further, there is not much point in having
a whacking big buffer for a tiny file. It will take only a few I/Os to
process the file anyway.

You will find that buffer sizes that are a power of two tend to work
faster than other sizes. This is because disk and RAM hardware are
designed around some magic sizes, typically 256, 512, 1024, 2048,
4096, 8192, 16,384, 32,768, 65,536, 131,072 and 262,144 bytes. Buffers
that are powers of two naturally do I/O in physical chunks that align
on powers of two boundaries in the file. This too makes the I/O more
efficient because the hardware works typically in 512 byte sector
chunks. If you do unbuffered I/O, likewise try to start your I/Os on
boundaries that are even multiples of some power of two, the higher
the power of two the better. e.g. it is better to start I/O on
boundaries that are even multiples of 8192 rather than just 128.
Sometimes it pays to pad your fixed-length records up to the next
power of two. If you can help it, arrange your logical record size and
buffer size so that logical records are aligned so that they never (or
rarely) span two buffer fulls. It also helps to have your buffers
aligned on physical RAM addresses that are even powers of two as well,
though you have no control of that in Java.

In the olden days, CØBØL programs used double buffering. They used two
or more buffers per file. The computer would read ahead filling
buffers while the program was busy processing one of the previous
buffers. Oddly, Java does not support this efficient serial processing
technique, though sometimes the operating system maintains its own
private set of read-ahead buffers behind the scenes. Unfortunately,
the OS's cascaded buffering is less efficient than using a single
layer. You have the overhead of copying plus the wasted RAM for the
buffers that are not actually used for physical I/O. Java never has
more than one buffer per file and hence cannot simultaneously process
and do physical I/O, unless of course it uses Threads. Even with
Threads, you can’t pull off double buffering with any ease.

The term double buffering also refers to a technique of constructing
Images off screen then blasting them onscreen once they are complete,
as a way of creating smoother animation.

If you wrote 128K a byte at a time using a 64K buffer there would be
only two physical 64K I/Os. This would be slightly slower than using
unbuffered I/O to write the entire 128K in one I/O because of the
extra physical I/O, the RAM overhead for the buffer and the CPU
overhead of copying the data to the buffer.

When To Buffer

To process a whole file at a time, read the entire file in one giant
unbuffered I/O.

If a file is too large to process all in RAM, read it buffered, and
process it a chunk, line, field or char at a time.

To copy files or download streams use the FileTransfer class which
reads unbuffered a large chunk at a time.

If you need the readLine method, you must use buffering.

Lars Enderin - 09 Jan 2008 08:24 GMT
Roedy Green wrote:

> would have just past the head, so you would have to wait an entire

s/past/passed/
Christian - 09 Jan 2008 13:39 GMT
Roedy Green wrote:

> If you wrote a byte at a time, since the hardware works in 512-byte
> sectors at a time, the OS would need to read the sector, plop your
[quoted text clipped - 5 lines]
> less than 1 physical track) at a pop, you could do that all in one
> rotation.

I doubt it is that simple with a modern OS, as discs have large caches
that buffer read/write operations. The OS has a cache that does
additional buffering. Sure, these caches may be slower than your buffer,
which may reside in the cache of the CPU... but that doesn't mean you
can measure or explain the latency of writing single bytes with HDD
rotation.
Lew - 09 Jan 2008 15:50 GMT
Roedy Green wrote:

>> If you wrote a byte at a time, since the hardware works in 512-byte
>> sectors at a time, the OS would need to read the sector, plop your
[quoted text clipped - 5 lines]
>> less than 1 physical track) at a pop, you could do that all in one
>> rotation.

> I doubt it is that simple with a modern OS, as discs have large caches
> that buffer read/write operations. The OS has a cache that does
> additional buffering. Sure, these caches may be slower than your buffer,
> which may reside in the cache of the CPU... but that doesn't mean you
> can measure or explain the latency of writing single bytes with HDD
> rotation.

Let us not forget the effect of file systems.  A journaling file system will
add more physical writes to the logical writes that Java requests, further
complicating matters.  And we aren't talking RAID, even.  As others have
pointed out, the issues pertain if disks aren't even involved, as with TCP/IP
streams.

It is next to useless to talk about platters and heads and disk spin in a Java
context.  Just about any IO Stream will behave better with larger chunks, up
to a point, even if it's only because of the CPU chip's own internal memory
cache.  Memory accesses are striped, too.

The rule of thumb is that a write() carries overhead.  The penalty of that
overhead is reduced with a larger payload - the Automated Teller Machine (ATM)
fee effect.  The larger the transaction, the smaller the fee in proportion to it.

For just about all practical IO Streams, the write() overhead is large enough
to make that 64KB go much faster as one write than as 64K individual one-byte
writes.  Disks, platters and heads are not even in that overhead any more [1]
- it's all OS, file-system and driver in-memory overhead and cache accesses,
mobo and outboard both.

[1] for the large category of applications not requiring guaranteed writes
(e.g., not RDBMSes).


John W. Kennedy - 09 Jan 2008 19:47 GMT
> In the olden days, CØBØL programs used double buffering. They used two
> or more buffers per file. The computer would read ahead filling
[quoted text clipped - 4 lines]
> the OS's cascaded buffering is less efficient than using a single
> layer.

Double buffering is an operating-system feature on IBM mainframes
(actually, these days, it's more likely to be quintuple buffering),
having nothing much to do with the language a program is written in.
Similarly not having double buffering is more a function of *ix and
Windows than of Java or C.

> To process a whole file at a time, read the entire file in one giant
> unbuffered I/O.

Or use MappedByteBuffer.
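
A minimal sketch of the memory-mapped approach, assuming a small temporary file and made-up names (MappedReadDemo, sumBytes, sumOfSample). The file is mapped read-only in one piece and then walked like an in-memory buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedReadDemo {
    /** Maps the whole file read-only and returns the sum of its unsigned bytes. */
    static long sumBytes(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (map.hasRemaining()) {
                sum += map.get() & 0xFF;   // no explicit read() calls on the file
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a four-byte sample file and sums it through the mapping. */
    static long sumOfSample() {
        try {
            Path tmp = Files.createTempFile("mapped", ".bin");
            tmp.toFile().deleteOnExit();   // mapped files can be hard to delete at once
            Files.write(tmp, new byte[] {1, 2, 3, (byte) 250});
            return sumBytes(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sum = " + sumOfSample());
    }
}
```

A single mapping is limited to Integer.MAX_VALUE bytes, so very large files would need to be mapped in several regions.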

> To copy files or download streams use the FileTransfer class which
> reads unbuffered a large chunk at a time.

File copying is most efficiently performed by the transferTo and
transferFrom methods of FileChannel.
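
A sketch of such a copy with FileChannel.transferTo follows. ChannelCopyDemo, copy and roundTrip are made-up names, and the files are temporary ones created only for the example. The loop is there because transferTo may move fewer bytes than requested in one call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Random;

class ChannelCopyDemo {
    /** Copies 'source' to 'target' using FileChannel.transferTo. */
    static void copy(Path source, Path target) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                // transferTo reports how many bytes it actually moved
                position += in.transferTo(position, size - position, out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies a file of pseudo-random bytes and checks the copy matches. */
    static boolean roundTrip() {
        try {
            Path source = Files.createTempFile("copy-src", ".bin");
            Path target = Files.createTempFile("copy-dst", ".bin");
            try {
                byte[] data = new byte[100_000];
                new Random(42).nextBytes(data);
                Files.write(source, data);
                copy(source, target);
                return Arrays.equals(data, Files.readAllBytes(target));
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copy matches: " + roundTrip());
    }
}
```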

Signature

John W. Kennedy
"The whole modern world has divided itself into Conservatives and
Progressives. The business of Progressives is to go on making mistakes.
The business of the Conservatives is to prevent the mistakes from being
corrected."
  -- G. K. Chesterton

Roedy Green - 09 Jan 2008 07:22 GMT
>>Did you read my essay at http://mindprod.com/jgloss/buffer.html
>>If you did, please read it again and tell me where you got
>>confused.
>
>I'm not sure how you got the impression that article  got me
>confused?

I thought I had answered all your questions in the essay.  You were
still asking questions.  That implied something in the essay was
inadequate or confusing.

Roedy Green - 09 Jan 2008 07:27 GMT
>.
>.
>fo.write(byte_64000);
>
>would be just as efficient as if buffered stream flushed those 64000
>bytes?

For large chunks UNbuffered streams are more efficient, though they do
the same thing physically. If you turn off buffering you save the
copying and RAM for buffers.

1. Try to read the file all in one go, so long as it is small enough
to process that way.  Use UNbuffered.

2. Otherwise buffer, and read a chunk, line, field or char at a time.

3. Use code like that in FileTransfer for bulk copying files or
downloaded streams.  It reads a large chunk at a time unbuffered.

4. Buffering is needed for readLine.


Roedy Green - 09 Jan 2008 07:33 GMT
>but if the disk wasn't constantly spinning then

then no writing at all would be possible.


