Hi all,
I have some data stored in a database as a BLOB, and I can read it
into a byte array using JDBC. Now I need to read the byte array through
an InputStream. The problem is that I don't know whether the data in the
database is stored in compressed form. If the byte array is compressed,
I have to use InflaterInputStream; otherwise I can use ObjectInputStream.
Before choosing a stream, how can I tell whether the byte array is
compressed or not?
Any idea how this can be done?
Thanks in advance,
Prakasan
Ingo R. Homann - 03 Aug 2005 09:43 GMT
Hi,
> hi All,
>
[quoted text clipped - 8 lines]
>
> Any idea how this can be done??
I am not familiar with InflaterInputStream, and you did not say much
about the kind of compression. In fact, *every* byte array can be viewed
as 'compressed' if you pick a suitable compression algorithm. So,
without knowing the algorithm, your problem cannot be solved.
Having said this, look at some of your byte arrays in detail. Normally,
because of what I said above, most compression algorithms add some kind
of header to mark the byte array as 'compressed by this algorithm'.
Hth,
Ingo
PS: Note that even if a byte array starts with "Compressed with GZIP
Version 3.45", this might be a coincidental sequence of bytes! :-)
So, if you *really* want to be sure, it would be a good idea to add a
header in every case (even if the byte array is not compressed).
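As an illustration of the header check described above, here is a minimal sketch in Java. It assumes the two formats from the original question: InflaterInputStream with a default Inflater reads zlib-wrapped data (RFC 1950), and ObjectInputStream reads data that begins with the Java serialization magic number 0xACED. The class name BlobSniffer and the Kind enum are hypothetical, and a match only says what the bytes look like; as noted above, the prefix can occur by coincidence.

```java
import java.io.ObjectStreamConstants;

final class BlobSniffer {
    enum Kind { JAVA_SERIALIZED, ZLIB, GZIP, UNKNOWN }

    /** Guesses the format of a blob from its first two bytes only. */
    static Kind sniff(byte[] data) {
        if (data == null || data.length < 2) {
            return Kind.UNKNOWN;
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        int first16 = (b0 << 8) | b1;

        // Java serialization streams start with STREAM_MAGIC (0xACED).
        if (first16 == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)) {
            return Kind.JAVA_SERIALIZED;
        }
        // GZIP members start with the bytes 0x1F 0x8B.
        if (b0 == 0x1F && b1 == 0x8B) {
            return Kind.GZIP;
        }
        // zlib: compression method 8 (deflate) in the low nibble of the
        // first byte, and the first two bytes are a multiple of 31.
        if ((b0 & 0x0F) == 8 && first16 % 31 == 0) {
            return Kind.ZLIB;
        }
        return Kind.UNKNOWN;
    }
}
```

A caller could hand the array to ObjectInputStream when sniff returns JAVA_SERIALIZED and to InflaterInputStream when it returns ZLIB. Two bytes are weak evidence, so this is best treated as a hint rather than a guarantee.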
Antti S. Brax - 03 Aug 2005 14:02 GMT
ihomann_spam@web.de wrote in comp.lang.java.programmer:
>> I have some data stored in the database as blob. I can read this blob
>> into byte array using jdbc. Now I need to read the byte array using
[quoted text clipped - 4 lines]
>> Before using which stream to use with the byte array I want to know if
>> the byte array is compressed or not?
<snip>
> PS: Note that even if a byte array starts with "Compressed with GZIP
> Version 3.45", this might be a coincidental sequence of bytes! :-)
> So, If you *really* want to be sure, it would be a good idea to add a
> header in any case (even, if the byte array is not compressed).
My first suggestion would be to add a table column which
indicates the compression algorithm (if any) used in the
data. If you can't change the database schema, then consider
adding a header. But then be aware that the data will become
useless for programs that don't understand your magic header.
Andrew Thompson - 03 Aug 2005 10:00 GMT
> hi All,
Hello again.. Please refrain from multi-posting.
<http://www.physci.org/codes/javafaq.jsp#xpost>
Also, please be clear that you are posting to Usenet, not Google.
<http://www.physci.org/codes/javafaq.jsp#usenet>
John Currier - 03 Aug 2005 14:27 GMT
> I have some data stored in the database as blob. I can read this blob
> into byte array using jdbc. Now I need to read the byte array using
[quoted text clipped - 4 lines]
> Before using which stream to use with the byte array I want to know if
> the byte array is compressed or not?
A simplistic approach is to just try to decompress it. If that fails
then it's not compressed (or is corrupted). This simplistic approach
assumes that the majority of the stored data is compressed. The
approach can be slightly optimized by checking the header yourself
before attempting a decompression. I've used this technique in
interceptors for decompressing CORBA traffic.
The only real risk is of successfully decompressing something that
wasn't compressed.
Antti's approach, however, is cleaner.
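A minimal sketch of the try-to-decompress approach, assuming the data was written in zlib format (what InflaterInputStream reads with its default Inflater). The class and method names are hypothetical. Returning null on failure is just one possible convention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.InflaterInputStream;

final class TryInflate {
    /**
     * Returns the inflated bytes, or null if the input cannot be read
     * as a complete zlib stream (not compressed, truncated, or corrupted).
     */
    static byte[] inflateOrNull(byte[] data) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // ZipException signals malformed data; EOFException a truncated stream.
            return null;
        }
    }
}
```

If the result is null, the caller can fall back to reading the original array directly. The risk mentioned above still applies: uncompressed data that happens to form a valid zlib stream would be inflated into garbage.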
John
http://schemaspy.sourceforge.net
Harald - 03 Aug 2005 22:07 GMT
> hi All,
>
[quoted text clipped - 6 lines]
> Before using which stream to use with the byte array I want to know if
> the byte array is compressed or not?
The entropy of compressed data tends to be higher than that of
uncompressed data. But this is just a statistical observation. :-)
Strange setup you have there where you don't know what kind of data
format you are dealing with.
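To make the entropy remark concrete, here is a small sketch that computes the Shannon entropy of a byte array in bits per byte. The class name is hypothetical, and any cut-off for "probably compressed" would be an assumption to tune on real data. Short arrays give unreliable estimates, and encrypted or already-compressed payloads such as images also score high.

```java
final class ByteEntropy {
    /** Shannon entropy of the byte values, from 0.0 up to 8.0 bits per byte. */
    static double bitsPerByte(byte[] data) {
        if (data.length == 0) {
            return 0.0;
        }
        // Count how often each of the 256 possible byte values occurs.
        int[] counts = new int[256];
        for (byte b : data) {
            counts[b & 0xFF]++;
        }
        double entropy = 0.0;
        for (int c : counts) {
            if (c == 0) {
                continue;
            }
            double p = (double) c / data.length;
            entropy -= p * (Math.log(p) / Math.log(2));
        }
        return entropy;
    }
}
```

Well-compressed data of reasonable size tends to come out close to 8 bits per byte, while text and many serialized objects score noticeably lower. This gives a hint only, never a proof.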
Harald.
Roedy Green - 04 Aug 2005 00:24 GMT
>Any idea how this can be done??
Putting a boolean on the front of the stream to tell you is the easiest
way.
I am beginning to realize the wisdom of putting a field on the front
of any stream giving the version. Otherwise it becomes impossible to
deal with old format files later on.
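A sketch of what such a leading field could look like, assuming you control both the writer and the reader of the blob. The one-byte layout, the constant values and the class name are all hypothetical. Using a small format code instead of a plain boolean leaves room for later versions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class TaggedBlob {
    // Hypothetical format codes stored in the first byte of every blob.
    static final byte FORMAT_RAW = 0;
    static final byte FORMAT_ZLIB = 1;

    /** Prepends the format byte, compressing the payload if requested. */
    static byte[] wrap(byte[] payload, boolean compress) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(compress ? FORMAT_ZLIB : FORMAT_RAW);
        if (compress) {
            try (OutputStream z = new DeflaterOutputStream(out)) {
                z.write(payload);
            }
        } else {
            out.write(payload, 0, payload.length);
        }
        return out.toByteArray();
    }

    /** Reads the format byte and returns a stream over the plain payload. */
    static InputStream open(byte[] blob) throws IOException {
        if (blob.length == 0) {
            throw new IOException("empty blob: missing format byte");
        }
        InputStream body = new ByteArrayInputStream(blob, 1, blob.length - 1);
        switch (blob[0]) {
            case FORMAT_RAW:
                return body;
            case FORMAT_ZLIB:
                return new InflaterInputStream(body);
            default:
                throw new IOException("unknown format byte: " + blob[0]);
        }
    }
}
```

The stream returned by open can then be wrapped in an ObjectInputStream whether or not the stored payload was compressed. Blobs written before the field was introduced have no such byte, so they would need a one-time migration or a separate check.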
Other than that, look at the two files with a hex viewer to see if
there is a recognisable signature.
see http://mindprod.com/jgloss/hex.html