Java Forum / General / February 2006

incremental archive format for outputstream?

NOBODY - 25 Feb 2006 14:04 GMT
Hi,

Do you know of an 'incremental' archive format that would be suited to an
OutputStream?

In other words, is there any archive format that can keep its existing
entries open and allow appending to them in an interleaved fashion (an
incrementally updating archive)?

Let me explain.

Let's say I have 2 types (A and B) of csv data to send.
A1.csv, A2.csv, A3.csv
B1.csv, B2.csv, B3.csv

I want to write to a stream an archive format that will contain 2 entries
(A and B) where A is the concatenation of A1+A2+A3, and B is the
concatenation of B1+B2+B3.

Now, imagine a zip file. It is easy enough to create a new zip entry A, and
push all A1, A2, A3 files in sequence, and create a second zip entry B and
push B1, B2, B3.

But here is the problem: the sequence is rolling (like a log4j
file-size-rolling appender), and by the time I have finished pushing A3,
B1 will have rolled off. I want to push A1 B1, A2 B2, A3 B3.

So I cannot use Java's zip support, at least not that I know of, to "append
to an existing entry" instead of calling putNextEntry().

Something smart like gzip (where you can concatenate independent gzip files
and they become a single valid gzip file), only with support for multiple
entries (which gzip doesn't have), would be great!
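
The gzip property mentioned above can be checked with a small sketch. It assumes a reasonably recent JDK, whose GZIPInputStream continues into the next gzip member when more input is available; the class and method names (GzipConcat, gzip, gunzipAll, demo) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: two independently compressed gzip members,
// joined byte for byte, decompress as one stream.
final class GzipConcat {

    // Compresses the text into a complete, standalone gzip member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Decompresses everything the stream yields, across all members.
    static String gunzipAll(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Concatenates two gzip members and reads them back as one.
    static String demo() throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(gzip("A1\n"));
        joined.write(gzip("A2\n"));
        return gunzipAll(joined.toByteArray());
    }
}
```

This covers a single logical stream only: plain gzip carries no entry names that a reader could rely on to separate A from B, which is exactly the limitation described above.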

Thanks.
Chris Uppal - 26 Feb 2006 11:23 GMT
> I want to write to a stream an archive format that will contain 2 entries
> (A and B) where A is the concatenation of A1+A2+A3, and B is the
> concatenation of B1+B2+B3.

I doubt if that's possible in any existing archive format.  Since the library
doesn't know how many "A" entries you are going to add, it doesn't know where
to put the "B" entries in the output file.

I suggest that you redesign.  One simple option would be to use two (or more)
output archives which you write concurrently.  A somewhat more complex, but
more elegant (IMO), option would be to layer your own "protocol" over an
existing archive format, so that you use what the archive code thinks of as
"files" as mere "chunks" in (logically) connected streams.

In the latter case, the archive would "think" that it contained:

   A.csv/A1.csv
   A.csv/A2.csv
   B.csv/B1.csv
   A.csv/A3.csv
   B.csv/B2.csv
   B.csv/B3.csv

but your code would interpret that as simply:

   A.csv
   B.csv

The ZIP file format (which has a table of contents) would be highly suitable
for the lower level of such a scheme, I think.  Note that you can use any names
you like for the entries in a ZIP file -- they don't have to be names of real
files (nor even valid filenames).
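
A minimal sketch of that layering with java.util.zip, under some assumptions: the "logical/chunk" naming convention, the class ChunkedZip and its methods are invented for illustration, and the reader simply treats everything before the first '/' as the logical stream name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: each ZIP entry is one chunk of a logical stream.
final class ChunkedZip {

    // Writes one chunk as its own entry named "<logical>/<chunk>".
    static void writeChunk(ZipOutputStream zip, String logical, String chunk,
                           String data) throws IOException {
        zip.putNextEntry(new ZipEntry(logical + "/" + chunk));
        zip.write(data.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    // Reads entries in archive order and appends each one to the buffer of
    // its logical stream. Assumes every entry name contains a '/'.
    static Map<String, String> readLogical(byte[] archive) throws IOException {
        Map<String, ByteArrayOutputStream> buffers = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            byte[] buf = new byte[4096];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                String logical = name.substring(0, name.indexOf('/'));
                ByteArrayOutputStream out =
                        buffers.computeIfAbsent(logical, k -> new ByteArrayOutputStream());
                int n;
                while ((n = zip.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, ByteArrayOutputStream> e : buffers.entrySet()) {
            result.put(e.getKey(),
                    new String(e.getValue().toByteArray(), StandardCharsets.UTF_8));
        }
        return result;
    }

    // Builds an archive with interleaved chunks, in the order listed above.
    static byte[] demoArchive() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            writeChunk(zip, "A.csv", "A1.csv", "a1\n");
            writeChunk(zip, "A.csv", "A2.csv", "a2\n");
            writeChunk(zip, "B.csv", "B1.csv", "b1\n");
            writeChunk(zip, "A.csv", "A3.csv", "a3\n");
            writeChunk(zip, "B.csv", "B2.csv", "b2\n");
            writeChunk(zip, "B.csv", "B3.csv", "b3\n");
        }
        return bytes.toByteArray();
    }
}
```

The reader here relies on entries coming back in the order they were written, which holds when the archive is read sequentially with ZipInputStream. A reader that uses the table of contents instead would need the chunk names to sort correctly, or an explicit sequence number in each name.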

   -- chris
Andrey Kuznetsov - 26 Feb 2006 12:51 GMT
> I doubt if that's possible in any existing archive format.  Since the
> library doesn't know how many "A" entries you are going to add, it doesn't
> know where to put the "B" entries in the output file.

A possible solution could be to keep the table of contents in another file.

Andrey Kuznetsov
http://uio.imagero.com Unified I/O for Java
http://reader.imagero.com Java image reader
http://jgui.imagero.com Java GUI components and utilities
