Java Forum / General / April 2006
Why is JAR so slow?
Wibble - 31 Mar 2006 13:21 GMT I've given up on it and started using gzip. It's about a bazillion times faster, going from minutes to seconds for some of our large code-generated jars.
Anyone else find this?
Ravi - 31 Mar 2006 16:20 GMT How will you execute the application from a gzip file?
Roedy Green - 31 Mar 2006 19:26 GMT >I've given up on it and started using >gzip. Its about a bazillion times >faster, going from minutes to seconds >for some our large code generated jars. They do quite different things. Gzip compresses an entire file. Jar prepares many small member entries each separately compressed.
Are you referring to some alternate Jar utility?
Also check out pack200 which compresses an entire uncompressed archive to get super compression.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
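To illustrate the distinction drawn above, here is a minimal sketch (not from the thread) using the standard java.util.zip classes. It assumes nothing about how jar itself is implemented; the class name and the com/example member names are made up for the example. A gzip stream is one opaque compressed stream, while a zip or jar archive holds named members that are each compressed on their own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class GzipVersusZip {

    // gzip: a single compressed stream with no member names and no directory.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(data);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    // zip/jar: every named member is written and compressed as its own entry.
    static byte[] zip(Map<String, byte[]> members) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            for (Map.Entry<String, byte[]> m : members.entrySet()) {
                out.putNextEntry(new ZipEntry(m.getKey()));
                out.write(m.getValue());
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    // Lists the member names of a zip archive held in memory.
    static List<String> memberNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = in.getNextEntry(); entry != null; entry = in.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical member names, for illustration only.
        Map<String, byte[]> members = new LinkedHashMap<>();
        members.put("com/example/A.class", "first member".getBytes(StandardCharsets.UTF_8));
        members.put("com/example/B.class", "second member".getBytes(StandardCharsets.UTF_8));
        System.out.println("zip members: " + memberNames(zip(members)));
        byte[] gz = gzip("one opaque stream".getBytes(StandardCharsets.UTF_8));
        System.out.println("gzip bytes:  " + gz.length);
    }
}
```

Because a gzip stream has no member directory, the JVM cannot look up individual classes in it, which is why only the zip layout works on a classpath.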
Wibble0@gmail.com - 31 Mar 2006 20:00 GMT > >I've given up on it and started using > >gzip. Its about a bazillion times [quoted text clipped - 12 lines] > Canadian Mind Products, Roedy Green. > http://mindprod.com Java custom programming, consulting and coaching. I'm talking about the plain old jar utility. I'm creating jar files to use in a classpath. I'm not interested that they're executable.
If you add a manifest, gzip makes jar files which java happily accepts. The jar file creation time is what I find crazy with the jar util that ships with Java.
I'm pretty sure that the reason jikes is faster is that it doesn't use java's jar support but embeds native zip.
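For comparison, a jar with a minimal manifest can also be built in-process with java.util.jar. This is only a sketch of the standard API, not what the poster did (he wrote the manifest by hand and ran an external zip). The class and method names are hypothetical, and the sketch makes no claim about speed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class MinimalJar {

    // Builds a jar in memory: a manifest first, then one entry per member.
    static byte[] build(Map<String, byte[]> members) {
        Manifest manifest = new Manifest();
        // The jar specification expects a Manifest-Version main attribute.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (JarOutputStream out = new JarOutputStream(buf, manifest)) {
            for (Map.Entry<String, byte[]> m : members.entrySet()) {
                out.putNextEntry(new JarEntry(m.getKey()));
                out.write(m.getValue());
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    // Reads the Manifest-Version back, or returns null if no manifest is found.
    static String manifestVersion(byte[] jar) {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jar))) {
            Manifest manifest = in.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.MANIFEST_VERSION);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> members = new LinkedHashMap<>();
        members.put("com/example/Placeholder.txt", new byte[] {42}); // hypothetical member
        System.out.println("Manifest-Version: " + manifestVersion(build(members)));
    }
}
```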
IchBin - 31 Mar 2006 20:11 GMT >>> I've given up on it and started using >>> gzip. Its about a bazillion times [quoted text clipped - 22 lines] > I'm pretty sure that the reason jikes is faster is that it doesn't use > java's jar support but embeds native zip. Just an aside question: I thought Jikes was no longer being developed at IBM, or that they were phasing it out.
 Signature Thanks in Advance... IchBin, Pocono Lake, Pa, USA http://weconsultants.servebeer.com/JHackerAppManager __________________________________________________________________________
'If there is one, Knowledge is the "Fountain of Youth"' -William E. Taylor, Regular Guy (1952-)
Wibble0@gmail.com - 31 Mar 2006 20:14 GMT > >>> I've given up on it and started using > >>> gzip. Its about a bazillion times [quoted text clipped - 35 lines] > 'If there is one, Knowledge is the "Fountain of Youth"' > -William E. Taylor, Regular Guy (1952-) Jikes is on sourceforge now, no longer ibm.
http://jikes.sourceforge.net/
IchBin - 31 Mar 2006 23:07 GMT > Jikes is on sourceforge now, no longer ibm. > > http://jikes.sourceforge.net/ Thanks for the info..
Roedy Green - 31 Mar 2006 20:22 GMT >If you add a manifest, gzip makes jar files which java happily accepts. >The jar file creation time is what I find crazy with the jar util that >ships java. To me, GZip means GZIPOutputStream. What does it mean to you?
Wibble0@gmail.com - 31 Mar 2006 20:17 GMT > >I've given up on it and started using > >gzip. Its about a bazillion times [quoted text clipped - 12 lines] > Canadian Mind Products, Roedy Green. > http://mindprod.com Java custom programming, consulting and coaching. I meant zip, not gzip. Sorry.
Roedy Green - 31 Mar 2006 20:27 GMT >I meant zip, not gzip. Sorry. You mean winzip, PkZip, 7Zip, WinRar?
Zip utilities don't build manifests. I gather you are doing that manually?
Zip utilities are highly tuned native assembler code. So it is no surprise that the generic code in jar.exe is much slower.
Wibble0@gmail.com - 31 Mar 2006 20:45 GMT > >I meant zip, not gzip. Sorry. > [quoted text clipped - 8 lines] > Canadian Mind Products, Roedy Green. > http://mindprod.com Java custom programming, consulting and coaching. I'm using solaris & cygwin zip. I'm creating manifests manually.
I'm exec'ing the zip from ant, as opposed to using ant's jar task.
<macrodef name="fastJar"><!-- jar is slow, so zip -->
  <attribute name="jarFile"/>
  <attribute name="basedir"/>
  <attribute name="javaVersion" default="1.4.2"/>
  <attribute name="zipdirs" default="com"/>
  <sequential>
    <dirname file="@{jarFile}" property="dir_@{jarFile}"/>
    <echo message="dbg:fastJar @{jarFile} @{baseDir} @{zipdirs}"/>
    <mkdir dir="@{basedir}/META-INF"/>
    <property name="metaInf.@{jarFile}" value=" META-INF"/>
    <echo file="@{basedir}/META-INF/MANIFEST.MF"
          message="Manifest-Version: 1.0${line.separator}Created-By: @{javaVersion} (Sun Microsystems Inc.)${line.separator}"/>
    <echo message="exec: zip -q -r @{jarFile}${metaInf.@{jarFile}} @{zipdirs}"/>
    <dirname property="dir.@{jarFile}" file="@{jarFile}"/>
    <mkdir dir="${dir.@{jarFile}}"/>
    <tempfile property="tmpJar.@{jarFile}" destDir="${dir.@{jarFile}}" prefix="tmpJar" suffix=".zip"/>
    <exec executable="zip" failonerror="true" dir="@{basedir}">
      <arg line="-q -r ${tmpJar.@{jarFile}}${metaInf.@{jarFile}} @{zipdirs}"/>
    </exec>
    <move file="${tmpJar.@{jarFile}}" tofile="@{jarFile}"/>
  </sequential>
</macrodef>
Mike Schilling - 01 Apr 2006 16:08 GMT >>I meant zip, not gzip. Sorry. > [quoted text clipped - 5 lines] > Zip utilities are highly tuned native assembler code. So it is no > surprise that the generic code in jar.exe is much slower. By generic you mean "written in C"? All of the file manipulation in java.util.{jar,zip} is done in native methods.
Wibble - 01 Apr 2006 17:56 GMT >>>I meant zip, not gzip. Sorry. >> [quoted text clipped - 8 lines] > By generic you mean "written in C"? All of the file manipulation in > java.util.{jar,zip} is done in native methods. So if it's native, why is it soooooo slow?
Stefan Ram - 01 Apr 2006 18:02 GMT >So if its native, why is it soooooo slow? The Just-in-time optimizer of the JVM does not apply to native code.
Roedy Green - 01 Apr 2006 19:13 GMT >> By generic you mean "written in C"? All of the file manipulation in >> java.util.{jar,zip} is done in native methods. >> >So if its native, why is it soooooo slow? The question is likely not why it is so slow but why the competition is so fast. Those assembler algorithms have been fine-tuned in proprietary ways since the DOS days.
The other thing is how much work did Sun do tuning the compression code to particular CPUs/platforms. Likely not much. Jar.exe for most people is adequate. If they were to invest time, they would optimise the other end, unpacking jars, which is time critical.
I suspect they did native code simply to use an existing C library. Today's optimisers could probably beat the C code.
Wibble - 01 Apr 2006 19:37 GMT >>>By generic you mean "written in C"? All of the file manipulation in >>>java.util.{jar,zip} is done in native methods. [quoted text clipped - 12 lines] > I suspect they did native code simply to use an existing C library. > Today's optimisers could probably beat the C code. We compile to and run out of jars. It was a big time sink waiting for things to jar. Opening jars is a runtime cost, but creating them is a development cost, and we all know development time is valuable.
A JIT optimizer might not kick in in time unless you were creating a lot of jars.
I've never really known what the boundaries and triggers are for the JIT compiler. Will it compile a method while it's executing? When it sees a lot of calls to a function?
Sun certainly has access to the solaris zip source. It should have been easier to snarf that instead of writing a slow version.
Roedy Green - 01 Apr 2006 19:07 GMT On Sat, 01 Apr 2006 15:08:01 GMT, "Mike Schilling" <mscottschilling@hotmail.com> wrote, quoted or indirectly quoted someone who said :
>By generic you mean "written in C"? All of the file manipulation in >java.util.{jar,zip} is done in native methods. The time-consuming thing is the compression algorithm. Making this fast is the key reason to pick one compression utility over another. See http://mindprod.com/jgloss/compressionutilities.html
Is the compression in jar.exe done in native? If so, it likely is not hand-tuned assembler though.
Mike Schilling - 01 Apr 2006 19:15 GMT > On Sat, 01 Apr 2006 15:08:01 GMT, "Mike Schilling" > <mscottschilling@hotmail.com> wrote, quoted or indirectly quoted [quoted text clipped - 9 lines] > Is the compression in jar.exe done in native? If so, it likely is not > hand-tuned assembler though. The compression methods in java.util.zip.Deflater are native. How they're implemented, I can't say.
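For readers who have not used it directly, here is a small round-trip sketch of the java.util.zip.Deflater and Inflater API mentioned above. It only shows how the class is called from Java and says nothing about how the native side is implemented; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class DeflaterRoundTrip {

    // Compresses the whole input at the given level (0-9, or Deflater.DEFAULT_COMPRESSION).
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish(); // no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the native resources
        }
    }

    // Reverses deflate(); rejects truncated or malformed input.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException ex) {
            throw new IllegalArgumentException(ex);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] original = new byte[10_000];
        Arrays.fill(original, (byte) 'x'); // highly repetitive sample data
        byte[] packed = deflate(original, Deflater.BEST_SPEED);
        byte[] unpacked = inflate(packed);
        System.out.println(original.length + " -> " + packed.length + " bytes, round trip ok: "
                + Arrays.equals(original, unpacked));
        System.out.println(new String(inflate(deflate("hello".getBytes(StandardCharsets.UTF_8),
                Deflater.BEST_COMPRESSION)), StandardCharsets.UTF_8));
    }
}
```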
Chris Uppal - 02 Apr 2006 11:42 GMT > The compression methods in java.util.zip.Deflater are native. How they're > implemented, I can't say. I believe that it uses the same code as is used in Cygwin (and presumably Solaris) zip. http://www.gzip.org/zlib/ http://www.info-zip.org/pub/infozip/Zip.html
I feel the difference Wibble's seeing must be due to different file handling and/or buffering strategies. Unfortunately, I haven't (yet?) been able to find the source to the jar program in the platform/JVM source.
I suppose another possibility is that Wibble's Ant tasks are kicking off an execution of the jar tool for each file that's added (seems unlikely, but you never know), in which case a large part of the difference would be down to the difference in startup times between a Cygwin exe (bad), and a JVM-based exe (much worse).
-- chris
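As an illustration of the buffering point, the sketch below zips a directory tree through a BufferedOutputStream so that the many small writes produced by lots of small entries are coalesced before they reach the file. It is not the jar tool's code, and nothing here shows that jar lacks such buffering. The class name, method name, and buffer size are assumptions made for the example, and the archive is assumed to live outside the tree being zipped.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class BufferedZipSketch {

    // Zips every regular file under root into archive and returns the entry count.
    // Assumption: archive is NOT located inside root.
    static int zipTree(Path root, Path archive, int bufferSize) {
        try {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(root)) {
                files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
            }
            try (ZipOutputStream out = new ZipOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(archive), bufferSize))) {
                for (Path file : files) {
                    // Zip entry names always use forward slashes.
                    String name = root.relativize(file).toString().replace(File.separatorChar, '/');
                    out.putNextEntry(new ZipEntry(name));
                    Files.copy(file, out);
                    out.closeEntry();
                }
            }
            return files.size();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: BufferedZipSketch <dir> <archive>");
            return;
        }
        long start = System.nanoTime();
        int entries = zipTree(Paths.get(args[0]), Paths.get(args[1]), 64 * 1024);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("%d entries in %d ms%n", entries, millis);
    }
}
```

Running it with different buffer sizes against a tree of many small files is one way to check whether buffering accounts for any of the difference on a given machine.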
Wibble - 01 Apr 2006 19:30 GMT > On Sat, 01 Apr 2006 15:08:01 GMT, "Mike Schilling" > <mscottschilling@hotmail.com> wrote, quoted or indirectly quoted [quoted text clipped - 9 lines] > Is the compression in jar.exe done in native? If so, it likely is not > hand-tuned assembler though. Roedy, can you add jar to your compression numbers? I'd be interested in seeing how it stacks up.
Roedy Green - 01 Apr 2006 23:36 GMT >Roedy, can you add jar to you're compression numbers. >I'd be interested in seeing how it stacks up. done.
See http://mindprod.com/jgloss/compressionutilities.html
In compressing, 7zip shrinks to 42% of original size. Winzip 58%, Jar 68%.
In speed, jar is 68% the speed of Winzip.
All this is under Win2k.
Wibble - 02 Apr 2006 03:46 GMT >>Roedy, can you add jar to you're compression numbers. >>I'd be interested in seeing how it stacks up. [quoted text clipped - 10 lines] > > All this is under Win2k. Thanks R
Martin Gregorie - 02 Apr 2006 12:05 GMT >> Roedy, can you add jar to you're compression numbers. >> I'd be interested in seeing how it stacks up. > > done. Useful stuff. Thanks.
In view of the preceding discussion, please consider adding similar decompression figures. If nothing else, it would be very interesting indeed to compare the compression/decompression times of each utility.
I don't think I'm too surprised that Windows takes longer to copy the file than zip does to compress it. This is true of any OS on a system with a fairly quick CPU and a fast compression algorithm. Straight copy does two disk operations per sector while writing a compressed archive only does 1.3 to 1.5 disk operations for each sector read.
This also means that, if JAR unpack is optimized, class loading from JAR files could be faster than from .class files.
I've not seen recent benchmarks, but historically Windows has never had the world's quickest i/o, primarily because it never used to overlap peripheral i/o with anything else, i.e. there is no overlap between devices or with moving the data between the i/o system and the application. Any OS using this approach will have significantly faster i/o times with compressed files simply because disk transfers are minimized.
 Signature martin@ | Martin Gregorie gregorie. | Essex, UK org |
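The disk-operation figures in the post above follow from a simple model, sketched here as an assumption rather than a measurement: each sector read costs one operation, and writing costs as many operations as the output-to-input size ratio. A straight copy has ratio 1.0 and so costs 2 operations per sector read; the quoted 1.3 to 1.5 corresponds to output that is 30% to 50% of the input size. The class and method names are hypothetical, and the model ignores caching, seeks, and archive directory overhead.

```java
final class DiskOpsEstimate {

    // Disk operations per sector read: one read plus `ratio` sectors written,
    // where ratio = output size / input size (1.0 for a plain copy).
    static double opsPerSectorRead(double outputToInputRatio) {
        if (outputToInputRatio < 0) {
            throw new IllegalArgumentException("ratio must not be negative");
        }
        return 1.0 + outputToInputRatio;
    }

    public static void main(String[] args) {
        // Plain copy, then the compressed sizes reported earlier in the thread
        // for Jar (68%), Winzip (58%) and 7zip (42%).
        String[] labels = {"copy", "jar", "winzip", "7zip"};
        double[] ratios = {1.00, 0.68, 0.58, 0.42};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-7s %.2f ops per sector read%n",
                    labels[i], opsPerSectorRead(ratios[i]));
        }
    }
}
```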
Chris Uppal - 02 Apr 2006 12:37 GMT > I don't think I'm too surprised that Windows takes longer to copy the > file than zip does to compress it. This is true of any OS on a system > with a fairly quick CPU and a fast compression algorithm. Straight copy > does two disk operations per sector while writing a compressed archive > only does 1.3 to 1.5 disk operations for each sector read. And even just for input, reducing the amount of data to stream off-disk can be a win. I've used Windows "compressed folders" before to speed up (by a factor of nearly two) a program that had to suck in a /lot/ of data.
BTW, another thing that can slow down copying a ZIP file is if you have a virus checker active. On one system I worked on (where I was using ~ 1GB ZIP files /compressed/), renaming a ZIP could take tens of minutes while the background VC scanned through all the entries....
-- chris
Wibble - 02 Apr 2006 16:25 GMT >>I don't think I'm too surprised that Windows takes longer to copy the >>file than zip does to compress it. This is true of any OS on a system [quoted text clipped - 12 lines] > > -- chris The problem comes about when jar'ing up lots of small files, which is typical of our application. The problem is observed on both Cygwin and Solaris. Monday I'll post actual numbers.
Ant does not spawn jar tasks, but uses the same vm for ant and its subordinate java tasks like jar or javac. I would expect jar to have a headstart since it doesn't have to spawn a process. It's no better without Ant.