Hi,
Do you know of an 'incremental' archive format that would be suited to an
OutputStream?
In other words, is there any archive format that can keep existing entries
open and allow appending to them in an interleaved fashion (an
incrementally updating archive)?
Let me explain.
Let's say I have 2 types (A and B) of CSV data to send.
A1.csv, A2.csv, A3.csv
B1.csv, B2.csv, B3.csv
I want to write an archive to a stream that will contain 2 entries
(A and B), where A is the concatenation of A1+A2+A3, and B is the
concatenation of B1+B2+B3.
Now, imagine a zip file. It is easy enough to create a new zip entry A, and
push all A1, A2, A3 files in sequence, and create a second zip entry B and
push B1, B2, B3.
But here is the problem: the sequence is rolling (like a log4j
file-size-rolling appender), and by the time I have finished pushing A3, B1
will have rolled off. I want to push A1 B1, A2 B2, A3 B3.
So I cannot use Java's zip support, at least not that I know of, to "append
to an existing entry" instead of calling putNextEntry().
Something smart like gzip (where you can concatenate independent gzip files
and they become a valid single gzip file), only with multiple entries (which
gzip doesn't have), would be great!
Thanks.
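
The gzip property mentioned above can be checked with a minimal sketch
like the following. It assumes a reasonably recent JDK, whose
GZIPInputStream keeps reading across concatenated gzip members; the class
and method names (GzipConcat, gzip, gunzip, demo) are made up for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of any library.
class GzipConcat {

    /** Compresses one piece of text into a complete, standalone gzip member. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decompresses gzip data, which may consist of several concatenated members. */
    static String gunzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Compresses two pieces separately, joins the raw bytes, and reads them back. */
    static String demo() {
        byte[] first = gzip("A1\n");
        byte[] second = gzip("A2\n");
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(first, 0, first.length);
        joined.write(second, 0, second.length);
        return gunzip(joined.toByteArray());
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

This only covers a single logical stream, which is exactly the limitation
described above: gzip has no notion of named entries.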
Chris Uppal - 26 Feb 2006 11:23 GMT
> I want to write to a stream an archive format that will contain 2 entries
> (A and B) where A is the concatenation of A1+A2+A3, and B is the
> concatenation of B1+B2+B3.
I doubt if that's possible in any existing archive format. Since the library
doesn't know how many "A" entries you are going to add, it doesn't know where
to put the "B" entries in the output file.
I suggest that you redesign. One simple option would be to use two (or more)
output archives which you write concurrently. A somewhat more complex, but
more elegant (IMO), option would be to layer your own "protocol" over an
existing archive format, so that what the archive code thinks of as "files"
are mere "chunks" of (logically) connected streams.
In the latter case, the archive would "think" that it contained:
A.csv/A1.csv
A.csv/A2.csv
B.csv/B1.csv
A.csv/A3.csv
B.csv/B2.csv
B.csv/B3.csv
but your code would interpret that as simply:
A.csv
B.csv
The ZIP file format (which has a table of contents) would be highly suitable
for the lower level of such a scheme, I think. Note that you can use any names
you like for the entries in a ZIP file -- they don't have to be names of real
files (nor even valid filenames).
-- chris
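
A minimal sketch of the chunk-per-entry scheme described above, using
java.util.zip. The class name ChunkedZip, its methods, and the sample data
are made up for illustration, and the "stream/chunk" naming simply follows
the listing in the post. It writes chunks in the order A1 B1, A2 B2, A3 B3
and regroups them by prefix on reading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of any library.
class ChunkedZip {

    /** Writes one chunk as its own ZIP entry named "<stream>/<chunk>". */
    static void writeChunk(ZipOutputStream zip, String stream, String chunk, String data) {
        try {
            zip.putNextEntry(new ZipEntry(stream + "/" + chunk));
            zip.write(data.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads entries in archive order and concatenates the chunks of each logical stream. */
    static Map<String, String> readStreams(byte[] archive) {
        Map<String, ByteArrayOutputStream> parts = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            byte[] buf = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                int slash = name.indexOf('/');
                // The part before the first '/' identifies the logical stream.
                String stream = slash < 0 ? name : name.substring(0, slash);
                ByteArrayOutputStream out =
                        parts.computeIfAbsent(stream, k -> new ByteArrayOutputStream());
                int n;
                while ((n = zip.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        parts.forEach((k, v) ->
                result.put(k, new String(v.toByteArray(), StandardCharsets.UTF_8)));
        return result;
    }

    /** Builds a small in-memory archive with interleaved A and B chunks. */
    static byte[] demoArchive() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            writeChunk(zip, "A.csv", "A1.csv", "a1\n");
            writeChunk(zip, "B.csv", "B1.csv", "b1\n");
            writeChunk(zip, "A.csv", "A2.csv", "a2\n");
            writeChunk(zip, "B.csv", "B2.csv", "b2\n");
            writeChunk(zip, "A.csv", "A3.csv", "a3\n");
            writeChunk(zip, "B.csv", "B3.csv", "b3\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(readStreams(demoArchive()));
    }
}
```

The reader here buffers everything in memory for brevity; a real consumer
would more likely route each chunk to a per-stream output as it arrives.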
Andrey Kuznetsov - 26 Feb 2006 12:51 GMT
> I doubt if that's possible in any existing archive format. Since the
> library
> doesn't know how many "A" entries you are going to add, it doesn't know
> where
> to put the "B" entries in the output file.
A possible solution could be to keep the table of contents in another file.
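
One way to read that suggestion, as a rough sketch only: append every
chunk to a single data stream and record (entry name, offset, length) in a
separate index. The class SideIndexArchive, its methods, and the
tab-separated index line format are all assumptions made for illustration;
both "files" are kept in memory here to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, not part of any library.
class SideIndexArchive {
    // Stands in for the data file: chunks are appended in arrival order.
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    // Stands in for the separate table-of-contents file: one line per chunk.
    private final List<String> index = new ArrayList<>();

    /** Appends a chunk and records "entry<TAB>offset<TAB>length" in the index. */
    void append(String entry, String chunk) {
        byte[] bytes = chunk.getBytes(StandardCharsets.UTF_8);
        index.add(entry + "\t" + data.size() + "\t" + bytes.length);
        data.write(bytes, 0, bytes.length);
    }

    byte[] dataBytes() {
        return data.toByteArray();
    }

    List<String> indexLines() {
        return index;
    }

    /** Reassembles one logical entry by following the index in order. */
    static String read(byte[] data, List<String> indexLines, String entry) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String line : indexLines) {
            String[] fields = line.split("\t");
            if (fields[0].equals(entry)) {
                int offset = Integer.parseInt(fields[1]);
                int length = Integer.parseInt(fields[2]);
                out.write(data, offset, length);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SideIndexArchive archive = new SideIndexArchive();
        archive.append("A.csv", "a1\n");
        archive.append("B.csv", "b1\n");
        archive.append("A.csv", "a2\n");
        System.out.print(read(archive.dataBytes(), archive.indexLines(), "A.csv"));
    }
}
```

The trade-off is that the data file alone is no longer self-describing:
both files have to travel together for the entries to be recoverable.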