Java Forum / General / January 2006

Gzip each chunk separately

Lior Knaany - 02 Jan 2006 17:38 GMT
Hi all,

I need some help understanding chunked & gzipped data in the HTTP/1.1
protocol, in particular the "Content-Encoding" and "Transfer-Encoding"
headers. (I am doing this in order to develop a web server filter.)

I noticed that when the server sends gzipped content in chunks, the
response headers look like this:

"Content-Encoding: gzip
Transfer-Encoding: Chunked"

The browser waits for all the chunks, concatenates them & runs
GUnZip on the result to get the content.

But why Gzip the entire data before sending ? Is there a way that the
server can Gzip the chunk & then send it (doing the same for all the
chunks)?
Meaning the Gzip will not be on the entire content all together, but
for each chunk.
This way the browser could read one chunk, GUnZip it, display the
result & continue to the next chunk.

If there is a way, what should the response headers look like ?
Maybe like this:  "Transfer-Encoding: Gzip,Chunked" with no
Content-Encoding header?

I have searched RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1",
but could not find any meaningful information for this question.

Please help,

Thanks in advance,
Lior.
Barry Margolin - 03 Jan 2006 05:42 GMT
> But why Gzip the entire data before sending ? Is there a way that the
> server can Gzip the chunk & then send it (doing the same for all the
[quoted text clipped - 3 lines]
> This way the browser could read one chunk, GUnZip it, display the
> result & continue to the next chunk.

Unless the chunks are really big, you're not going to get very good
compression that way.  Gzip uses an adaptive compression algorithm, so
it gets better as the amount of data increases.

But since gzip is also a stream compression algorithm, it can be done on
the fly as each chunk is sent and received.

Signature

Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Lior Knaany - 03 Jan 2006 11:05 GMT
Thanks Barry,

I know that Gzip will work poorly on a smaller content, but can it be
done (gzip on each chunk separately)?
& if so, what should the headers look like ?
Chris Smith - 04 Jan 2006 00:01 GMT
> I know that Gzip will work poorly on a smaller content, but can it be
> done (gzip on each chunk separately)?
> & if so, what should the headers look like ?

No, it can't be done.  (Or rather, if you do it then general-purpose
browsers won't understand.)

Signature

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Lior Knaany - 05 Jan 2006 17:44 GMT
Thanks Chris,

That is exactly what I am experiencing when producing such a page,
I just thought, maybe I am doing something wrong with the headers.

Well thanks again for the info Chris.
Michael Wojcik - 06 Jan 2006 15:15 GMT
> > I know that Gzip will work poorly on a smaller content, but can it be
> > done (gzip on each chunk separately)?
> > & if so, what should the headers look like ?
>
> No, it can't be done.  (Or rather, if you do it then general-purpose
> browsers won't understand.)

Though as Barry pointed out, you can achieve essentially the same
effect; neither the sender nor the receiver need buffer all the data
and compress or decompress it at once, since gzip is a streaming
compressor.

There's nothing to stop the server from reading N bytes of the file
it's sending, initializing the compressor, compressing those N bytes
to M bytes, sending an M-byte chunk, reading the next N bytes,
compressing those without reinitializing the compressor, and so
forth.  The receiver can treat that just as it would a content-body
that was compressed in its entirety before chunking. The only
difference, as far as the receiver can tell, is that the chunks will
probably vary in size if the sender compresses each chunk in turn.

By the same token, the receiver can initialize the decompressor
before processing the first chunk, then pass it each chunk as it's
received.  It needn't buffer the entire compressed content-body.
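
The scheme described above can be sketched in Java roughly as follows. This is an illustration under assumptions, not code from anyone in the thread: the class StreamedGzipChunks, its methods and the DechunkingInputStream helper are made-up names, and the chunk parser ignores chunk extensions and trailers. It relies on the GZIPOutputStream constructor that takes a syncFlush flag (Java 7 and later) and on InputStream.readAllBytes (Java 9 and later). The response headers stay exactly as in the original question: "Content-Encoding: gzip" and "Transfer-Encoding: chunked".

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; none of these names come from the thread.
class StreamedGzipChunks {
    /** Sender: ONE gzip stream, flushed (never reinitialized) after every piece. */
    static List<byte[]> compressInChunks(List<byte[]> pieces) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            List<byte[]> chunks = new ArrayList<>();
            // syncFlush = true: flush() emits everything compressed so far.
            GZIPOutputStream gzip = new GZIPOutputStream(sink, true);
            for (byte[] piece : pieces) {
                gzip.write(piece);
                gzip.flush();
                chunks.add(sink.toByteArray());
                sink.reset();
            }
            gzip.finish(); // last deflate block plus the gzip trailer
            chunks.add(sink.toByteArray());
            return chunks;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Frames the compressed chunks as an HTTP/1.1 chunked message body. */
    static byte[] toChunkedBody(List<byte[]> chunks) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            if (chunk.length == 0) continue; // a zero-size chunk would end the body
            byte[] sizeLine = (Integer.toHexString(chunk.length) + "\r\n")
                    .getBytes(StandardCharsets.US_ASCII);
            body.write(sizeLine, 0, sizeLine.length);
            body.write(chunk, 0, chunk.length);
            body.write('\r'); body.write('\n');
        }
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        body.write(last, 0, last.length);
        return body.toByteArray();
    }

    /** Receiver side: strips the chunk framing as bytes are pulled through it. */
    static final class DechunkingInputStream extends InputStream {
        private final InputStream in;
        private int remaining = 0;
        private boolean done = false;

        DechunkingInputStream(InputStream in) { this.in = in; }

        @Override
        public int read() throws IOException {
            if (done) return -1;
            if (remaining == 0) {
                remaining = Integer.parseInt(readLine(), 16); // chunk-size line
                if (remaining == 0) { done = true; return -1; }
            }
            int b = in.read();
            if (b < 0) throw new EOFException("truncated chunk");
            if (--remaining == 0) readLine(); // CRLF that follows the chunk data
            return b;
        }

        private String readLine() throws IOException {
            StringBuilder line = new StringBuilder();
            int c;
            while ((c = in.read()) != -1 && c != '\n') {
                if (c != '\r') line.append((char) c);
            }
            return line.toString();
        }
    }

    /** Receiver: one decompressor for the whole body, pulling chunk data as needed. */
    static byte[] receive(byte[] chunkedBody) {
        try (InputStream in = new GZIPInputStream(
                new DechunkingInputStream(new ByteArrayInputStream(chunkedBody)))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> pieces = new ArrayList<>();
        for (String s : new String[] {"first part, ", "second part, ", "third part"}) {
            pieces.add(s.getBytes(StandardCharsets.UTF_8));
        }
        List<byte[]> chunks = compressInChunks(pieces);
        for (byte[] chunk : chunks) System.out.println("chunk of " + chunk.length + " bytes");
        System.out.println(new String(receive(toChunkedBody(chunks)), StandardCharsets.UTF_8));
    }
}
```

The compressor is created once and only flushed between pieces, so the receiver sees a single ordinary gzip stream whose chunks happen to differ in size. A real server would write each chunk to the socket as soon as it is produced instead of collecting them in a list.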

Signature

Michael Wojcik                  michael.wojcik@microfocus.com

I gave my love some irises.
(She was sick with viruses.)        -- Charlie Gibbs

Lior Knaany - 16 Jan 2006 10:16 GMT
Thanks Michael,

That was very enlightening.
Rogan Dawes - 09 Jan 2006 07:48 GMT
>>I know that Gzip will work poorly on a smaller content, but can it be
>>done (gzip on each chunk separately)?
>>& if so, what should the headers look like ?
>
> No, it can't be done.  (Or rather, if you do it then general-purpose
> browsers won't understand.)

In fact, the gzip algorithm allows for independently gzipped content to be
concatenated, and it will still unzip just fine.

$ echo file 1 > file1
$ echo file 2 > file2
$ gzip file1 file2
$ cat file1.gz file2.gz > file3.gz
$ gunzip file3.gz
$ cat file3
file 1
file 2
$

So, if you created a gzipped stream by concatenating gzipped output, the
browser SHOULD read it as the concatenation of the uncompressed files.
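
For readers working in Java, the same experiment can be repeated with java.util.zip. The sketch below is only an illustration: ConcatenatedGzipMembers and its helper methods are made-up names, and it assumes a JDK whose GZIPInputStream continues into following members (Java 7 and later) and that provides InputStream.readAllBytes (Java 9 and later). It shows what this one decoder does; it says nothing about how any particular browser behaves.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class mirroring the shell session above.
class ConcatenatedGzipMembers {

    /** Compresses text into one complete gzip member (header, data, trailer). */
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Plain byte concatenation, like `cat file1.gz file2.gz`. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Decompresses with a single GZIPInputStream over the whole byte array. */
    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] file3 = concat(gzip("file 1\n"), gzip("file 2\n"));
        System.out.print(gunzip(file3));
    }
}
```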

Regards,

Rogan
Chris Smith - 09 Jan 2006 16:49 GMT
> $ echo file 1 > file1
> $ echo file 2 > file2
[quoted text clipped - 5 lines]
> file 2
> $

Interesting...

Signature

www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Chris Uppal - 10 Jan 2006 11:13 GMT
[irrelevant and/or non-existent x-postings trimmed]

> In fact, the gzip algorithm allows for independently gzipped content to be
> concatenated, and it will still unzip just fine.

More accurately, the gzip /program/ will act as you describe.   The compressed
format itself, the GZIP format as specified in RFC 1952, does naturally
concatenate, but only in the sense that a file in that format consists of a
number of elements, each of which is an independently compressed "file" (the
format even includes an embedded file name!).

It's difficult to state how a browser should interpret a gzip-format stream
which consists of several compressed elements.  If the browser's decompression
is based on the zlib library, then that library does not automatically hide the
boundaries between the separate "files" in the stream (and nor should it), so
it is quite possible -- even probable -- that the browser would stop
decompressing at the end of the first compressed "file" in the stream.
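
The boundary between members can be made visible from Java with the low-level java.util.zip.Inflater class, which wraps zlib. The sketch below uses made-up names (GzipMemberBoundary, Result, inflateFirstMember) and one loudly stated assumption: the input was written by java.util.zip.GZIPOutputStream, which emits the minimal 10-byte gzip header with no optional fields, so the header can be skipped with a fixed offset instead of being parsed. It illustrates where a raw decompressor stops, not what any browser does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

// Hypothetical demo class; the names are illustrative only.
class GzipMemberBoundary {

    // Assumption: the member starts with the minimal 10-byte header that
    // GZIPOutputStream writes (no file name, comment or extra field).
    static final int HEADER_LENGTH = 10;

    static final class Result {
        final String text;     // what the first member decoded to
        final int unconsumed;  // input bytes left over after the first member's data
        Result(String text, int unconsumed) { this.text = text; this.unconsumed = unconsumed; }
    }

    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Inflates the deflate data of the first member only and reports what is left. */
    static Result inflateFirstMember(byte[] stream) {
        Inflater inflater = new Inflater(true); // raw deflate, no wrapper handling
        inflater.setInput(stream, HEADER_LENGTH, stream.length - HEADER_LENGTH);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0) break; // input ran out before the deflate data ended
                out.write(buf, 0, n);
            }
            return new Result(new String(out.toByteArray(), StandardCharsets.UTF_8),
                    inflater.getRemaining());
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] second = gzip("file 2\n");
        Result r = inflateFirstMember(concat(gzip("file 1\n"), second));
        System.out.print(r.text);
        System.out.println(r.unconsumed + " bytes were not consumed (second member is "
                + second.length + " bytes)");
    }
}
```

The decompressor reports that it has finished at the end of the first member's data and leaves the first member's 8-byte trailer plus the whole second member untouched. Code that wants the second "file" has to notice the leftover bytes and start again, which is exactly the step a decoder may or may not take.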

OTOH (reverting to the original poster's question), I don't see any reason why
the server cannot send chunked and compressed data, nor any reason (except,
perhaps, convenience) why the browser should not decompress such data
incrementally.  The underlying compression format (shared by "GZIP" and
"DEFLATE") is capable of being flushed and/or reset in mid-stream, so the
server could flush the compression algorithm at the end of each chunk, and that
would be transparent to the browser as it was decompressing it (assuming the
use of a library at least as well-designed as zlib).

In point of fact, however, I'm not sure I see any real reason why the server
should even bother to flush the compression algorithm -- it could just
accumulate compressed data until it had enough for one chunk (possibly leaving
some data in the compression code's buffers).  Send that as one chunk.  The
client would decompress in the same incremental way.
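
A minimal Java sketch of that last point, under stated assumptions: the class ArbitraryChunkBoundaries and its methods are made-up names, the 64-byte chunk size is arbitrary, and the data uses the zlib wrapper that java.util.zip.Deflater and Inflater produce and consume by default rather than the gzip wrapper, purely to keep the example short (the compressed data inside is the same format). Compressing everything first and slicing afterwards stands in for a server that accumulates compressor output until it has enough for a chunk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; the names are illustrative only.
class ArbitraryChunkBoundaries {

    /** Compresses the data as one stream, with no mid-stream flush. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Cuts the compressed bytes into fixed-size chunks, ignoring their content. */
    static List<byte[]> slice(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < data.length; from += chunkSize) {
            chunks.add(Arrays.copyOfRange(data, from, Math.min(data.length, from + chunkSize)));
        }
        return chunks;
    }

    /** Feeds one chunk at a time; element i holds what became decodable after chunk i. */
    static List<byte[]> inflateChunkByChunk(List<byte[]> chunks) {
        Inflater inflater = new Inflater();
        List<byte[]> decodedPerChunk = new ArrayList<>();
        byte[] buf = new byte[4096];
        try {
            for (byte[] chunk : chunks) {
                inflater.setInput(chunk);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    if (n == 0) break; // nothing more until the next chunk arrives
                    out.write(buf, 0, n);
                }
                decodedPerChunk.add(out.toByteArray());
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return decodedPerChunk;
    }

    public static void main(String[] args) {
        StringBuilder page = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            page.append("line ").append(i).append(" of the response body\n");
        }
        byte[] original = page.toString().getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = slice(deflate(original), 64);
        int total = 0;
        for (byte[] part : inflateChunkByChunk(chunks)) {
            total += part.length;
            System.out.println("chunk decoded to " + part.length + " more bytes");
        }
        System.out.println("all bytes recovered: " + (total == original.length));
    }
}
```

Without a flush, some chunks may decode to nothing because the bytes received so far end in the middle of a compressed symbol. The output simply shows up with a later chunk, which is the trade-off against flushing at every chunk boundary.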

   -- chris

