Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
HomeAnnouncementsWhite Papers
Discussion GroupsFirst AidDatabasesJavaBeansGUIJava 3DVirtual MachineCORBASecurityToolsGeneral
Java DirectoryOpen Source ProjectsSample Book ChaptersUser GroupsWeb Resources
Related Topics
Databases.NETMore Topics ...

Java Forum / General / April 2008

Tip: Looking for answers? Try searching our database.

Deflating a File

Thread view: 
Chase Preuninger - 17 Apr 2008 21:09 GMT
One thing I just noticed was that when I deflated a file full or
random bytes it actually increased its size.  Just thought that was
kind of neat.
Joshua Cranmer - 17 Apr 2008 21:57 GMT
> One thing I just noticed was that when I deflated a file full or
> random bytes it actually increased its size.  Just thought that was
> kind of neat.

Compression works by collapsing redundant sequences of information into
shorter bit strings, typically (though not always) through some sort of
dictionary system or run-length encoding. True random data cannot be
compressed most of the time; sufficiently pseudorandom data is also not
likely to be compressed well either.

Random data being compressed would be surprising.

Signature

Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Owen Jacobson - 18 Apr 2008 14:23 GMT
> > One thing I just noticed was that when I deflated a file full or
> > random bytes it actually increased its size.  Just thought that was
[quoted text clipped - 7 lines]
>
> Random data being compressed would be surprising.

*Pseudo*random data can, of course, be compressed to a description
(goedel number?  source code?  whatever) of the algorithm plus the
initial conditions used to generate the values.  This suggests to me
that there might be a fixed (or almost fixed) amount of entropy in
PRNG output, regardless of how many digits of output you have.

Hmm.

-o
Andreas Leitgeb - 18 Apr 2008 15:01 GMT
> *Pseudo*random data can, of course, be compressed to a description
> (goedel number?  source code?  whatever) of the algorithm plus the
> initial conditions used to generate the values.  This suggests to me
> that there might be a fixed (or almost fixed) amount of entropy in
> PRNG output, regardless of how many digits of output you have.

If it's say 1e1000000000000000000 digits from the PRNG-Output, it's
not all that "regardless", as you'll need some extra bytes to encode
the actual length of the sequence :-)
rossum - 18 Apr 2008 17:08 GMT
>*Pseudo*random data can, of course, be compressed to a description
>(goedel number?  source code?  whatever) of the algorithm plus the
[quoted text clipped - 3 lines]
>
>Hmm.
Correct.  In cryptography a PRNG is used to stretch entropy.  A PRNG
is seeded from a true RNG and reseeded regularly to make sure that the
entropy is not stretched too thin.  A PRNG can only output as much
total entropy as is put in via the seed/reseed process.

rossum
Arne Vajhøj - 19 Apr 2008 18:14 GMT
>>> One thing I just noticed was that when I deflated a file full or
>>> random bytes it actually increased its size.  Just thought that was
[quoted text clipped - 12 lines]
> that there might be a fixed (or almost fixed) amount of entropy in
> PRNG output, regardless of how many digits of output you have.

True.

But current available compression algorithms does not analyze
data and detect the RNG algorithm. And I am pretty sure that
they will not in the future either.

So in practice the data does not compress.

Arne
Logan Shaw - 19 Apr 2008 20:36 GMT
>>> One thing I just noticed was that when I deflated a file full or
>>> random bytes it actually increased its size.  Just thought that was
[quoted text clipped - 12 lines]
> that there might be a fixed (or almost fixed) amount of entropy in
> PRNG output, regardless of how many digits of output you have.

Assuming that the PRNG's state isn't shared among multiple pieces of
code generating separate streams of data.  This could be a deterministic
process (no threads or other external entropy mixed in) but still screw
things up mightily.

I suppose you can view this issue as merely a parameterized PRNG in some
sense, I don't know.  I think I'm just wondering if it's fully general
to reason about a PRNG in the context of only one piece of code using it,
not two unrelated pieces of code sharing the same PRNG.

  - Logan
Patricia Shanahan - 17 Apr 2008 22:07 GMT
> One thing I just noticed was that when I deflated a file full or
> random bytes it actually increased its size.  Just thought that was
> kind of neat.

The job of a compression algorithm is to map some class of files to
shorter files. Since there is a fixed number of possible files of each
length, to achieve that it also has to map other files to longer files.

Patricia
Roedy Green - 18 Apr 2008 02:49 GMT
On Thu, 17 Apr 2008 13:09:21 -0700 (PDT), Chase Preuninger
<chasepreuninger@gmail.com> wrote, quoted or indirectly quoted someone
who said :

>One thing I just noticed was that when I deflated a file full or
>random bytes it actually increased its size.  Just thought that was
>kind of neat.

That's to be expected. You add the overhead without getting ANYTHING
back.  A  random file by definition can't be compressed.

Zipping a mess of zips buys you  a little not because the zip parts
compress, but because the filenames and other uncompressed parts
compress.
Signature


Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Logan Shaw - 19 Apr 2008 20:41 GMT
> On Thu, 17 Apr 2008 13:09:21 -0700 (PDT), Chase Preuninger
> <chasepreuninger@gmail.com> wrote, quoted or indirectly quoted someone
[quoted text clipped - 6 lines]
> That's to be expected. You add the overhead without getting ANYTHING
> back.  A  random file by definition can't be compressed.

This needs a little clarification to be unambiguously correct.

A *single* random file may be able to be compressed.  It's quite possible
my random number generator will a sequence of the same value 1000 times in
a row.  It's extraordinarily unlikely, but it's possible[1].

What's impossible is to produce a compressor that will compress *all*
random files.

  - Logan

[1]  If it weren't, then it wouldn't be a random sequence, because the
     generator was excluding certain output.


Free Magazines

Get these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...

Oracle MagazineNetwork ComputingComputer WorldBio-IT WorldeWeekInformation WeekInfosecurity
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.