Java Forum / General / March 2006

what happens to buffer ?

gk - 30 Mar 2006 06:34 GMT
byte[] buffer = new byte[512];
int read;
while ((read=in.read(buffer)) >0) {
  out.write(buffer, 0, read);
}

In the first iteration, the buffer is filled with bytes.

What happens in the next iteration?

Is the buffer first cleared and then filled afresh,

OR

is the buffer overwritten with the new incoming bytes?
Patricia Shanahan - 30 Mar 2006 06:43 GMT
> byte[] buffer = new byte[512];
> int read;
[quoted text clipped - 11 lines]
>
> the buffer is overwritten with the new incoming bytes ?

If "in" is an InputStream reference, the InputStream javadoc covers this
in detail.

Patricia
gk - 30 Mar 2006 06:50 GMT
> If "in" is an InputStream
Yes, you are right.

The javadoc says:

"public int read(byte[] b)
        throws IOException

   Reads some number of bytes from the input stream and stores them
into the buffer array b. The number of bytes actually read is returned
as an integer. This method blocks until input data is available, end of
file is detected, or an exception is thrown.

   If b is null, a NullPointerException is thrown. If the length of b
is zero, then no bytes are read and 0 is returned; otherwise, there is
an attempt to read at least one byte. If no byte is available because
the stream is at end of file, the value -1 is returned; otherwise, at
least one byte is read and stored into b.

   The first byte read is stored into element b[0], the next one into
b[1], and so on. The number of bytes read is, at most, equal to the
length of b. Let k be the number of bytes actually read; these bytes
will be stored in elements b[0] through b[k-1], leaving elements b[k]
through b[b.length-1] unaffected.

   If the first byte cannot be read for any reason other than end of
file, then an IOException is thrown. In particular, an IOException is
thrown if the input stream has been closed.

   The read(b) method for class InputStream has the same effect as:

read(b, 0, b.length)

"

My question is:

What happens in the next iteration?

Is the buffer first cleared and then filled afresh,

OR

is the buffer overwritten with the new incoming bytes?

The javadoc does not answer this question.
Patricia Shanahan - 30 Mar 2006 06:54 GMT
>>If "in" is an InputStream
>
[quoted text clipped - 43 lines]
>
> javadoc  does not answer this question.

Yes it does, because it does not say "This is the behavior for the first
iteration only". The material you quoted applies to every call to
InputStream's read with a byte buffer, regardless of whether it is the
first call with that buffer or not.

Patricia
Oliver Wong - 30 Mar 2006 18:36 GMT
[post re-ordered]

> my question is
>
[quoted text clipped - 7 lines]
>
> javadoc  does not answer this question.

It does:

> Let k be the number of bytes actually read; these bytes
> will be stored in elements b[0] through b[k-1], leaving elements b[k]
> through b[b.length-1] unaffected.

   - Oliver
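
[Editor's illustration] The "unaffected" clause in the quoted javadoc can be seen with a small sketch. It assumes a stream that delivers 25 bytes on the first read and 10 on the second; SequenceInputStream over two ByteArrayInputStreams is used here only as a convenient way to get that chunking, and the class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StaleBytesDemo {

    /** Reads a 25-byte chunk, then a 10-byte chunk, into the same buffer. */
    static byte[] readTwiceIntoSameBuffer() {
        byte[] first = new byte[25];
        Arrays.fill(first, (byte) 'A');
        byte[] second = new byte[10];
        Arrays.fill(second, (byte) 'B');

        // SequenceInputStream does not read across the boundary between its
        // substreams in a single call, so the chunks arrive in separate reads.
        InputStream in = new SequenceInputStream(
                new ByteArrayInputStream(first),
                new ByteArrayInputStream(second));

        byte[] buffer = new byte[512];
        try {
            int k1 = in.read(buffer); // stores 'A' in buffer[0..24]
            int k2 = in.read(buffer); // stores 'B' in buffer[0..9] only
            if (k1 != 25 || k2 != 10) {
                throw new IllegalStateException("unexpected chunking: " + k1 + ", " + k2);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer;
    }

    public static void main(String[] args) {
        byte[] buffer = readTwiceIntoSameBuffer();
        // Elements 10..24 still hold the 'A' bytes from the first read.
        System.out.println(new String(buffer, 0, 25, StandardCharsets.US_ASCII));
    }
}
```

This is why the copy loop passes the return value to out.write(buffer, 0, read): only the first "read" elements are fresh after each call.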
Roedy Green - 30 Mar 2006 07:00 GMT
>in the first iteration, the buffer is filled up with bytes.
>
[quoted text clipped - 5 lines]
>
>the buffer is overwritten with the new incoming bytes ?

Why would it matter? You know how many bytes there are when you are
done.  If you are just curious, have a look at src.zip and, failing
that, the Sun source code.  See http://mindprod.com/jgloss/jdk.html
Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Thomas Schodt - 30 Mar 2006 08:25 GMT
> byte[] buffer

a byte array reference "buffer"

> byte[] buffer = new byte[512];

is assigned to reference a (new) byte array of 512 bytes.
Initially these bytes will all contain 0 (ascii NUL).

> int read;
> while ((read=in.read(buffer)) >0) {

here between 1 and 512 bytes of
the byte array referenced by "buffer"
are "filled" with byte values
starting from - the start of the byte array.

>    out.write(buffer, 0, read);
> }
[quoted text clipped - 8 lines]
>
> the buffer is overwritten with the new incoming bytes ?

between 1 and 512 bytes of
the byte array referenced by "buffer"
are "filled" with byte values
starting from - the start of the byte array.

Maybe you read about nio ByteBuffer
and you are confusing the two?
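
[Editor's illustration] If the confusion does come from java.nio.ByteBuffer, the following sketch may help. A ByteBuffer carries a position and a limit, and its clear() method resets those two values without erasing the stored bytes; a plain byte[] passed to InputStream.read has no such state at all. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

class ByteBufferStateDemo {

    /**
     * Returns { position after put, limit after flip, position after clear,
     *           limit after clear, first byte after clear }.
     */
    static int[] putFlipClear() {
        ByteBuffer bb = ByteBuffer.allocate(8);
        bb.put((byte) 1).put((byte) 2).put((byte) 3);
        int positionAfterPut = bb.position();

        bb.flip();   // limit = old position, position = 0
        int limitAfterFlip = bb.limit();

        bb.clear();  // position = 0, limit = capacity; contents are untouched
        return new int[] {
            positionAfterPut, limitAfterFlip, bb.position(), bb.limit(), bb.get(0)
        };
    }

    public static void main(String[] args) {
        for (int value : putFlipClear()) {
            System.out.println(value);
        }
    }
}
```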
gk - 30 Mar 2006 10:43 GMT
I am more confused after those answers.

Let me explain the problem more clearly.

Say that in the first iteration there were 10 bytes in the stream (because
it is streaming, and bytes might arrive slowly).

So 10 bytes are read by the read() method and go into the buffer.

The write method uses this byte buffer.

Now, in the second iteration, say there are 25 bytes in the stream.

So 25 bytes are read by the read() method and go into the
buffer.

But after the first iteration the buffer held 10 bytes. What will happen
to those 10 bytes now?

Will they be cleared first, and then the 25 bytes placed in the buffer,

OR

will the buffer be completely overwritten with the 25 new
bytes?

> > byte[] buffer
>
[quoted text clipped - 33 lines]
> Maybe you read about nio ByteBuffer
> and you are confusing the two?
Chris Uppal - 30 Mar 2006 11:00 GMT
> the buffer would be  completely overwritten with these new coming 25
> bytes ?

That's correct.  The first 25 bytes of the buffer would be overwritten.  The 10
bytes from the previous read() would be lost.

(How could the second call to read() "know" that a previous call had put 10
bytes into the buffer ?  And even if it did know, why should it care?
Presumably if the programmer hadn't wanted to overwrite the existing data, then
s/he would have used the longer form of read() which takes an argument to say
where in the buffer to start writing.)

   -- chris
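
[Editor's illustration] The "longer form of read()" mentioned above is read(byte[] b, int off, int len). A minimal sketch of using it to append each chunk after the previous one, instead of overwriting from index 0, might look like this. The helper names readUpTo and twoChunks are invented for the example, and the two-chunk stream is only a stand-in for data that arrives in pieces.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class ReadWithOffsetDemo {

    /** Fills the buffer as far as the stream allows, appending each chunk. */
    static int readUpTo(InputStream in, byte[] buffer) {
        int total = 0;
        try {
            while (total < buffer.length) {
                // Start writing at index 'total' so earlier data is kept.
                int n = in.read(buffer, total, buffer.length - total);
                if (n == -1) {
                    break; // end of stream
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** A stream that delivers 10 'A' bytes, then 25 'B' bytes, in separate reads. */
    static InputStream twoChunks() {
        byte[] first = new byte[10];
        Arrays.fill(first, (byte) 'A');
        byte[] second = new byte[25];
        Arrays.fill(second, (byte) 'B');
        return new SequenceInputStream(
                new ByteArrayInputStream(first), new ByteArrayInputStream(second));
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[512];
        int total = readUpTo(twoChunks(), buffer);
        System.out.println("bytes kept in buffer: " + total);
    }
}
```

With the short form read(buffer), the same two chunks would both land at index 0 and the first 10 bytes would be gone after the second call.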
gk - 30 Mar 2006 11:30 GMT
What should the size of the buffer be?

Is   byte[] buffer = new byte[512];   ENOUGH?

Suppose that at some point a large number of bytes (say 1000 bytes)
arrives at once.

Then what will happen? The buffer can't accept more than 512 bytes.
Will the additional 1000 - 512 = 488 bytes still be in the
stream, or will they be lost?

The reason I ask is that some people use

byte[] buffer = new byte[256];
byte[] buffer = new byte[512];
byte[] buffer = new byte[1024];

Which one is good?

Or is anything OK? Does it really matter? Is the coder
responsible for choosing the size of the byte array?
Gordon Beaton - 30 Mar 2006 11:48 GMT
> what should be the size of buffer ?
>
[quoted text clipped - 17 lines]
> or    anything is ok . does it  matter really ?  does the coder
> responsible for choosing the size of the byte ?

Each time you call read(), the new bytes are written at the start of
the buffer unless you tell read() to do otherwise. If there were
already some data in the buffer from a previous read, it will be
overwritten with the new data.

Also, read() will never read more than the number of bytes you
request, or the length of the buffer if you don't specify. Note that
read() can and often will return *fewer* bytes than you request,
so you need to check the return value.

Any bytes you don't read will wait nicely in the stream until you
choose to read them.

So you can decide to read as much or as little as you want each time,
and can choose an appropriate buffer size. Normally it's more efficient
to read a lot of data each time and in powers of two, but depending on
your application you may want to read less.

/gordon

Signature

[  do not email me copies of your followups  ]
g o r d o n + n e w s @  b a l d e r 1 3 . s e
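
[Editor's illustration] The 1000-bytes-into-a-512-byte-buffer case from the question can be sketched as follows. It assumes an in-memory ByteArrayInputStream holding 1000 bytes, which hands over as much as fits on each call; a network or file stream may return smaller chunks. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LeftoverBytesDemo {

    /** Records how many bytes each read(buffer) call returned. */
    static List<Integer> readSizes(int available, int bufferSize) {
        InputStream in = new ByteArrayInputStream(new byte[available]);
        byte[] buffer = new byte[bufferSize];
        List<Integer> sizes = new ArrayList<>();
        try {
            int read;
            while ((read = in.read(buffer)) > 0) {
                sizes.add(read); // bytes not read yet simply stay in the stream
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // For this in-memory stream: 512 bytes first, then the remaining 488.
        System.out.println(readSizes(1000, 512));
    }
}
```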

Patricia Shanahan - 30 Mar 2006 15:57 GMT
...
> So you can decide to read as much or as little as you want each time,
> and can choose an appropriate buffer size. Normally it's more efficient
> to read a lot of data each time and in powers of two, but depending on
> your application you may want to read less.

Why the preference for powers of two?

Patricia
Remon van Vliet - 30 Mar 2006 16:09 GMT
> ...
>> So you can decide to read as much or as little as you want each time,
[quoted text clipped - 5 lines]
>
> Patricia

Because it looks cool, I think, but other than that I can think of exactly
zero reasons to make buffers a power of 2.
Gordon Beaton - 30 Mar 2006 16:32 GMT
> Why the preference for powers of two?

When reading from a stream that maps to a file, I believe that read
efficiency is improved by aligning reads to OS buffer sizes and
ultimately file system or NFS block sizes. AFAIK all of these are
normally powers of 2.

I suppose if you're reading from a TCP stream, then multiples of MSS
bytes might be more appropriate.

Superstition? Maybe.

/gordon

Remon van Vliet - 30 Mar 2006 16:40 GMT
>> Why the preference for powers of two?
>
[quoted text clipped - 9 lines]
>
> /gordon

Even if that were so, you'd need to know the actual buffer sizes of said OS
to have a noticeable improvement though, and there's a fair chance even then
the difference is negligible. It's often more useful to adjust the buffer
size to something sensible for said application. All that said, I always
allocate general purpose buffers to be a power of 2... there's something
appealing to the numbers 512 and 4096... maybe I'm just weird.
Oliver Wong - 30 Mar 2006 18:38 GMT
> ...
>> So you can decide to read as much or as little as you want each time,
[quoted text clipped - 3 lines]
>
> Why the preference for powers of two?

   Because it has always been done that way. Do not question your elders!

   - Oliver
Roedy Green - 30 Mar 2006 18:56 GMT
>> So you can decide to read as much or as little as you want each time,
>> and can choose an appropriate buffer size. Normally it's more efficient
>> to read a lot of data each time and in powers of two, but depending on
>> your application you may want to read less.
>
>Why the preference for powers of two?

Physical i/o is done in terms of some power of two, often 512 bytes. If
your buffer is a nice multiple, physical i/o can go directly to it.
Chris Smith - 30 Mar 2006 19:03 GMT
> ...
> > So you can decide to read as much or as little as you want each time,
[quoted text clipped - 3 lines]
>
> Why the preference for powers of two?

Because they are more exciting.  Many developers, myself included, are
quite reluctant to give up this remaining connection to our industry's
more technical past.

Seriously, I can't think of a reason.

Signature

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Roedy Green - 30 Mar 2006 23:06 GMT
>Seriously, I can't think of a reason.

I am very surprised so many people did not immediately say:

if your buffer is not a multiple of 512, then the OS is going to have
to allocate its own buffer to read the multiple of 512 and copy the
bytes. If your buffer is a multiple of 512, there is a good chance it
can do the I/O directly into your buffer.  Physical i/o is done in
terms of disk sectors.

Patricia Shanahan - 31 Mar 2006 03:23 GMT
>>Seriously, I can't think of a reason.
>
[quoted text clipped - 5 lines]
> can do the I/O directly into your buffer.  Physical i/o is done in
> terms of disk sectors.

I've done that sort of thing where I knew enough about disk transfer
sizes and alignment requirements, and that the system would use it that way.

However, even java.nio buffers are only claimed to be suitable for
direct I/O if they are allocated by the ByteBuffer allocateDirect
factory method. Are you sure the JVM does direct I/O to ordinary byte
arrays?

Patricia
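
[Editor's illustration] The distinction drawn above between ordinary byte arrays and buffers from the allocateDirect factory method is visible through the ByteBuffer API itself. The sketch below only shows what the API reports; it says nothing about what a particular JVM does internally when InputStream.read is given a plain byte[]. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {

    /** Returns { heap.isDirect(), heap.hasArray(), direct.isDirect() }. */
    static boolean[] directFlags() {
        // A heap buffer is a view over an ordinary Java byte array.
        ByteBuffer heap = ByteBuffer.wrap(new byte[4096]);
        // A direct buffer is allocated by the factory method named in the post.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096);
        return new boolean[] { heap.isDirect(), heap.hasArray(), direct.isDirect() };
    }

    public static void main(String[] args) {
        boolean[] flags = directFlags();
        System.out.println("heap buffer direct?   " + flags[0]);
        System.out.println("heap buffer has array? " + flags[1]);
        System.out.println("direct buffer direct? " + flags[2]);
    }
}
```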
Roedy Green - 31 Mar 2006 04:07 GMT
> Are you sure the JVM does direct I/O to ordinary byte
>arrays?

The JVM likely has nothing to do with it. At the OS level you tell the
OS to deliver N bytes from offset X in the file to offset Y in your
buffer.

If the OS is clever it has prefetched those bytes and copied them to
your buffer.

A long time ago hardware insisted on reading blocks and plopping them
at 512-byte boundaries.  I don't think that is still so, but I think
it may still be so that physical buffers need to be multiples of 512.
In Jet, buffers are automatically aligned on paragraph (16-byte)
boundaries.

In benchmarking, you must watch out that you don't reuse the same
file, since any reasonably decent OS will soon cache it. Once it is
cached, magic buffer sizes would no longer apply, at least until we get
special hardware for copying pages around.

It is not just Windows you are talking about, but ancient old OS's
like IBM's that support Java.
James McGill - 31 Mar 2006 02:39 GMT
> Why the preference for powers of two?

Systems that live close to the architecture often find performance
benefits, if not hard requirements, to align things on boundaries.  It's
quite possible that you might find low-level disc seeks that are not
capable of seeking to an arbitrary address, but instead, deal in offsets
from some block boundary -- and that will invariably be divided into
some power of two.

But in this case, it's not at all clear, if it's even defined, whether
it matters, or if there's any performance implication at all, or if the
compiler or bytecode machine aligns them for you anyway, or if it would
be more efficient to use a prime number instead of a power of two, or
anything else about it.  It's not a common thing to divide a buffer by
two, or to arrange buffers for best fit in a larger "power-of-two"
block, or to deal separately with "high and low half-buffers", or
anything of this nature.  

It appears this is a historical idiom, not of the language, but of the
programmers.  But it's hardly coincidental.  Everything digital is
organized in finite quantities, every resource being bounded by some
power of two.  

Maybe the next generation will revisit the merits of this whole "binary"
thing, and something better will emerge.  When it does, do you think we
will have to throw away everything we know about discrete math?  

In the meantime, I'll bet a dollar it does not matter whether you make
your buffers 2000, 2047, 2048, or 2049 bytes.  (And I'll gladly pay up
if someone can show me metrics that show otherwise!)
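
[Editor's illustration] Anyone wanting to collect metrics for this bet could start from a crude timing sketch like the one below. It is not a rigorous benchmark: there is no JIT warm-up, and the temporary file is created and then re-read immediately, so the OS will almost certainly serve it from its cache. The file size, file name prefix, and class and method names are all arbitrary choices made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferSizeTiming {

    /** Reads the whole file with the given buffer size; returns bytes read. */
    static long drain(Path file, int bufferSize) {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) > 0) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Creates a temporary file of the given size, filled with zero bytes. */
    static Path makeTestFile(int size) {
        try {
            Path file = Files.createTempFile("buffer-bet", ".bin");
            Files.write(file, new byte[size]);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = makeTestFile(16 * 1024 * 1024);
        for (int size : new int[] {2000, 2047, 2048, 2049}) {
            long start = System.nanoTime();
            long bytes = drain(file, size);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(size + "-byte buffer: " + bytes + " bytes in " + micros + " us");
        }
    }
}
```

A fairer comparison would repeat each size many times, vary the order, and use files that are not already cached.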
Roedy Green - 31 Mar 2006 03:22 GMT
On Thu, 30 Mar 2006 18:39:19 -0700, James McGill
<jmcgill@cs.arizona.edu> wrote, quoted or indirectly quoted someone
who said :

>In the meantime, I'll bet a dollar it does not matter whether you make
>your buffers 2000, 2047, 2048, or 2049 bytes.  (And I'll gladly pay up
>if someone can show me metrics that show otherwise!)

On the other paw, you might as well use powers of two since you have
no evidence that avoiding them is better.  Those are the natural  size
containers programmers think in.

Are you sure that even nio does not like buffers that are multiples of page
frames? It seems highly unlikely.

If I am going to take you up on your bet, I want to find out in
advance what you would consider "cheating".  If I find even one OS
where it matters do I get your dollar?

And what percentage speedup from using magic
multiples do I have to get for it to count as faster?

Am I allowed to use Jet, Java 1.6, -server?
James McGill - 31 Mar 2006 05:27 GMT
> If I find even one OS
> where it matters do I get your dollar?

If it's BSDI you only get a Canadian Dollar.  If it's SCO, you owe me a
dollar and shame on you for having a SCO box :-)  
Chris Uppal - 30 Mar 2006 14:05 GMT
> Then what will happen ?  the buffer cant accept more than 512 bytes
> .....will the additional bytes 1000-512 =  488 will still be in the
> stream ? or they will be lost ?

The extra bytes remain in the stream until you are ready to read them.

> the reason is, some people use
>
[quoted text clipped - 3 lines]
>
> which one is good ?

It doesn't matter very much.  In theory the larger the buffer the higher the
potential speed, but in practice I just choose a number like 4096 and don't
worry about it.

The reason it can be faster is that IF each call to read() ends up calling the
similar function in the underlying OS, then there's a certain fixed overhead
per call.  So the more data you read in one call, the less the overhead when
averaged over all the bytes you read.

Note that that doesn't apply if you are using a BufferedInputStream (or
something like it) because it does the buffering for you.  That way you can
read tiny little chunks at a time (or even single bytes at a time) with very
little effect on performance.

   -- chris
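
[Editor's illustration] The per-call overhead argument can be made concrete by counting how often the underlying stream is asked for data. In this sketch a small counting wrapper (a hypothetical helper, not a library class) sits beneath an optional BufferedInputStream while the caller reads one byte at a time. The in-memory source stands in for a stream whose reads would each reach the OS.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferedReadDemo {

    /** Counts every read request that reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads 'size' bytes one at a time; returns calls made to the source. */
    static int underlyingCalls(boolean buffered, int size) {
        CountingInputStream counter =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        InputStream in = buffered ? new BufferedInputStream(counter) : counter;
        try {
            while (in.read() != -1) {
                // consume one byte per call
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + underlyingCalls(false, 10_000));
        System.out.println("buffered:   " + underlyingCalls(true, 10_000));
    }
}
```

Without buffering, every single-byte read reaches the source; with BufferedInputStream, only a handful of bulk reads do.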
Thomas Schodt - 30 Mar 2006 12:21 GMT
gk wrote (edited):

> in the first iteration, there were 10 byte values in the stream.
> (because its streaming and bytes might come slowly slowly)
[quoted text clipped - 15 lines]
> 10 byte values were stored in the byte array.
> what will happen to those 10 byte values now ?

They are lost.

> are those cleared first and then 25 byte values would be placed.
>
> OR
>
> the buffer would be completely overwritten with these new coming 25
> bytes ?

What is the difference?
Is there a difference? Not that I know of.

Re 'clear' - Are you asking if read() first stores zeroes in all the
bytes of the byte array?
No, it does not.

Re 'new bytes' - Are you asking if the byte array referenced by "buffer"
is replaced with a new byte array?
No, it is not. "buffer" still references the same byte array only now
some of the bytes of the byte array have new values.

What are you asking?
Patricia Shanahan - 30 Mar 2006 15:32 GMT
> more confused with those answers.
>
[quoted text clipped - 6 lines]
>
> write method uses this byte buffer.

"The first byte read is stored into element b[0], the next one into
b[1], and so on. The number of bytes read is, at most, equal to the
length of b. Let k be the number of bytes actually read; these bytes
will be stored in elements b[0] through b[k-1], leaving elements b[k]
through b[b.length-1] unaffected."

Assume all 10 bytes are read by the first call, k is 10 and b is your
byte buffer. Following the read call, elements 0 through 9 of your
buffer contain the 10 bytes of read data. Elements 10 through 511 still
contain whatever they contained before the read call.

> Now, in the second iteration say, there is 25 bytes in the stream
>
> so, so  25  bytes  is read by the read()  method  and going  to the
> buffer.

"The first byte read is stored into element b[0], the next one into
b[1], and so on. The number of bytes read is, at most, equal to the
length of b. Let k be the number of bytes actually read; these bytes
will be stored in elements b[0] through b[k-1], leaving elements b[k]
through b[b.length-1] unaffected."

Assume all bytes are read by the second call, k is 25 and b is your
byte buffer. Following the read call, elements 0 through 24 of your
buffer contain the 25 bytes of read data. Elements 25 through 511 still
contain whatever they contained before the read call.

The repetition of the quote is deliberate. That paragraph tells you what
happens to each element of your buffer, regardless of whether it is the
first read call or the millionth read call.

> but  in the first iteration buffer had 10 bytes ....what  will happen
> to those 10 bytes now ?

Suppose you had written:

buffer[0] = 7;

followed some time later by

buffer[0] = 23;

What happens to the 7? That is what happens to the first 10 elements,
when you do a read that gets at least 10 bytes of data.

> does those will be cleared off first and then 25 bytes would be placed.

I don't see any reason for a prior clear operation, rather than just
writing the new data over the old.

Patricia

