Java Forum / General / May 2007
Java (bytecode) execution speed
Lee - 29 Apr 2007 19:25 GMT All other things being equal, we expect an interpreted language to run a bit slower than native machine code.
I understand that in the beginning, the earliest version of the JVM was perceived to be demonstrably slower than a corresponding C program, but that more recent versions have improved considerably.
What's the current state of the art? Would we expect a Java program to run at 0.5 * the speed of C, or 0.7 or 0.9 or what?
Obviously it all depends on what you're doing and how you're doing it, but still, there must be some rough rule of thumb as to what's a reasonable expectation for how fast a Java program should run compared to the equivalent C routine.
I'm asking programmers, rather than the "announcement/advocacy" groups in the hopes you will have a more realistic idea of how things actually work under the sphere of the moon.
Chris Smith - 29 Apr 2007 19:37 GMT > All other things being equal, we expect an interpreted language to run a > bit slower than native machine code. Okay, but there are no interpreted languages discussed in the rest of your post. All common implementations of Java use a JIT compiler, which should lead to no foregone conclusions about whether it will be faster or slower than a statically compiled language. There are reasons it's likely to be slower (the compiler must run quickly, so it won't do a lot of global optimization), and reasons that code produced by today's very sophisticated JIT compilers for Java is likely to run faster (because it runs at runtime, the compiler can take advantage of statistical information about how users are using the application right now, and optimize those paths as the fast cases).
If you are looking into performance numbers for Java, you are making a mistake by focusing on the bytecode execution model. In the end, it's about a wash -- no loss or gain there. The really important stuff is in garbage collection and management of the heap. Java's language definition is such that there are often far more heap allocations than a typical C program, and they are far shorter lived. Different heap management techniques are used to adapt to that situation. This affects the performance of Java applications quite a lot.
On average, the best heap allocation management (including garbage collection) algorithms out there perform slightly worse than a C-style explicit heap with free lists. Of course, there are applications where it performs much better because it reduces the amount of computation that goes into book-keeping; and there are applications where it performs much worse if the lifecycles of objects in the application are unusual. (Some poorly performing cases, should you be interested, are when a large number of objects live just ever so slightly long enough to make it out of the nursery, or when there are lots of long-lived objects that contain frequently modified references to very short-lived objects.)
> What's the current state of the art? Would we expect a Java program to > run at 0.5 * the speed of C, or 0.7 or 0.9 or what? I'd say anywhere from 0.7 to 1.1 times the speed of C would be a reasonable guess, depending mainly on the nature of object lifetimes. But it's just that: a guess.
 Signature Chris Smith
Stefan Ram - 29 Apr 2007 19:51 GMT >If you are looking into performance numbers for Java, you are >making a mistake by focusing on the bytecode execution model. [quoted text clipped - 3 lines] >often far more heap allocations than a typical C program, and >they are far shorter lived. Two quotations regarding garbage collection:
»Your essay made me remember an interesting phenomenon I saw in one system I worked on. There were two versions of it, one in Lisp and one in C++. The display subsystem of the Lisp version was faster. There were various reasons, but an important one was GC: the C++ code copied a lot of buffers because they got passed around in fairly complex ways, so it could be quite difficult to know when one could be deallocated. To avoid that problem, the C++ programmers just copied. The Lisp was GCed, so the Lisp programmers never had to worry about it; they just passed the buffers around, which reduced both memory use and CPU cycles spent copying.«
<XNOkd.7720$zx1.5584@newssvr13.news.prodigy.com>
»A lot of us thought in the 1990s that the big battle would be between procedural and functional programming, and we thought that functional programming would provide a big boost in programmer productivity. I thought that, too. Some people still think that. It turns out we were wrong. Functional programming is handy dandy, but it's not really the productivity booster that was promised. The real significant productivity advance we've had in programming has been from languages which manage memory for you automatically. It can be with reference counting or garbage collection; it can be Java, Haskell, Visual Basic (even 1.0), Smalltalk, or any of a number of scripting languages. If your programming language allows you to grab a chunk of memory without thinking about how it's going to be released when you're done with it, you're using a managed-memory language, and you are going to be much more efficient than someone using a language in which you have to explicitly manage memory. Whenever you hear someone bragging about how productive their language is, they're probably getting most of that productivity from the automated memory management, even if they misattribute it.«
http://www.joelonsoftware.com/articles/APIWar.html
Stefan Ram - 29 Apr 2007 19:54 GMT Supersedes: <heap-20070429204231@ram.dialup.fu-berlin.de>
I would like to add two quotations to this thread:
»Your essay made me remember an interesting phenomenon I saw in one system I worked on. There were two versions of it, one in Lisp and one in C++. The display subsystem of the Lisp version was faster. There were various reasons, but an important one was GC: the C++ code copied a lot of buffers because they got passed around in fairly complex ways, so it could be quite difficult to know when one could be deallocated. To avoid that problem, the C++ programmers just copied. The Lisp was GCed, so the Lisp programmers never had to worry about it; they just passed the buffers around, which reduced both memory use and CPU cycles spent copying.«
<XNOkd.7720$zx1.5584@newssvr13.news.prodigy.com>
»A lot of us thought in the 1990s that the big battle would be between procedural and object oriented programming, and we thought that object oriented programming would provide a big boost in programmer productivity. I thought that, too. Some people still think that. It turns out we were wrong. Object oriented programming is handy dandy, but it's not really the productivity booster that was promised. The real significant productivity advance we've had in programming has been from languages which manage memory for you automatically.«
http://www.joelonsoftware.com/articles/APIWar.html
Arne Vajhøj - 29 Apr 2007 22:37 GMT > All other things being equal, we expect an interpreted language to run a > bit slower than native machine code. Since Java is not interpreted but JIT compiled, that point is not so relevant.
> I understand that in the beginning, the earliest version of the JVM was > perceived to be demonstrably slower than a corresponding C program; but > that more recent versions have improved considerably. > > What's the current state of the art? Would we expect a Java program to > run at 0.5 * the speed of C, or 0.7 or 0.9 or what? My expectations would be within the range 0.75-1.25!
The variations between different C compilers with different settings, different JVMs with different switches, and different tasks are so big that the language difference is insignificant.
Arne
Lee - 30 Apr 2007 16:48 GMT > All other things being equal, we expect an interpreted language to run a > bit slower than native machine code. <SNIP>
At least two people were kind enough to point out that Java uses a JIT compilation system and that in any case the difference in execution time between a Java program and a hypothetical compiled version of the same algorithm (or as similar as the two languages allow) would probably be due more to the differences in heap management and/or garbage collection than to raw compilation speed.
In that context it becomes plausible that in some circumstances Java might actually run faster than an equivalent C/C++ implementation.
I must be missing an important nuance about Java and JIT.
Perhaps someone can "debug" me on this:
I had thought that bytecode was, so to speak, the "machine code" of the Java Virtual Machine. If that were true, I can't see how there would be any room (or any need) for further compilation of the byte code. The byte code itself would "drive" the VM, taking the VM from internal state to internal state until the computation was done.
But if the bytecode were just a portable abstraction, something "above" the JVM's machine language but "below" the java source language, that would create the need to compile the byte code "the rest of the way down" to the actual jvm machine language, but NOT to native hardware machine language.
So even in that case, the "compilation" would be down to the VM's machine language, not the actual hardware's machine language.
But all the descriptions I see on the net about JIT talk in terms of compilation to native machine language. I can see how that would work with something like, say, the Pascal p-system, where Pascal source would be compiled into "p-code", and then the p-code would either be interpreted or "just-in-time" compiled to native hardware machine language.
My problem is that in my conception, when it is a question of running a virtual machine, the "compilation" would be to that VM's "machine" language and that's as "low" as you could go.
What have I got wrong?
JT - 30 Apr 2007 17:56 GMT > I had thought that bytecode was, so to speak, > the "machine code" of the Java Virtual Machine. It is.
> If that were true, I can't see how there would be any room The Java Virtual Machine Specification is a precise document on the meaning of bytecodes. So JIT compilers simply attempt to produce native binaries that have the same behavior as the bytecode.
I don't see how the JVMS prevents that.
> (or any need) for further compilation of the byte code. The need is speed. Interpretation is much slower than native execution.
> But if the bytecode were just a portable abstraction, something "above" > the JVM's machine language but "below" the java source language No. The byte code is the native machine code of the JVM.
> My problem is that in my conception, when it is a question of running a > virtual machine, the "compilation" would be to that vm's "machine" > language The JIT does not compile to the VM's machine language. In fact, the JIT always compiles to the CPU's machine language.
> and thats as "low" as you could go. Why? The JVM itself has full access to the Operating System that the JVM is running on. Whenever you have native methods (eg. some of the GUI methods and the IO methods...), the JVM will have to invoke the corresponding services from the Operating System.
So, a JVM could invoke a JIT to translate frequently-executed code into a suitable binary format that the OS can execute.
I see no problem there. (And as you noted, there are many powerful Java JIT out there)
- JT
Lee - 01 May 2007 03:59 GMT >>I had thought that bytecode was, so to speak, >>the "machine code" of the Java Virtual Machine. > > It is. So I'm not completely in cloud cuckoo land. Phew!
>>If that were true, I can't see how there would be any room > > The Java Virtual Machine Specification is a precise document > on the meaning of bytecodes. So JIT compilers simply > attempt to produce native binaries that have the same behavior > as the bytecode.
> I don't see how the JVMS prevents that. See below
<snip>
> No. The byte code is the native machine code of the JVM. > [quoted text clipped - 8 lines] > > Why? Why I don't understand how you can go "lower" than the VM's machine code:
I'm handicapped by not knowing the design/architecture of the actual JVM, so even though the best way to explain my difficulty would be to do so in terms of the actual JVM instructions, I will do the next best thing, and try to do it in terms of a simpler and entirely mythical virtual machine.
Let's suppose I have a string-handling virtual machine. It's got a string store and it has two native operations (among others, but we're just interested in showing why I think it's not possible to get "below" the virtual machine's own machine language). Your mission impossible, should you choose to accept it, is to expose the flaw in how I'm thinking.
The two operations are "Head", which returns the first (zeroth) character of the string, and "Tail", which returns the substring that remains after removing the first (zeroth) character. Gee, sounds like Lisp car and cdr, but "never mind".
So the "byte" code for "Head" is x01 and the byte code for "Tail" is x02. The implementation of the string virtual machine on a particular hardware platform consists of the native hardware machine instructions that make up the internal structures of the VM (implemented, of course, as "real" structures built out of real memory and real registers and all that).
I suppose in one sense, you can say that the "compilation" of the byte code x01 and/or x02 is the set of machine instructions used to implement that part of the string virtual machine in the real hardware.
Compilation of the byte code would be nothing more or less than re-implementing a portion of the string virtual machine. So that makes no sense to me, as presumably you've done it right the first time when you implemented the string virtual machine for that hardware in the first place.
Are you saying that the VM is dynamically re-implemented at run time?
Another way to see my difficulty is to imagine that a virtual machine instruction changes the internal state of the virtual machine in some "atomic" way. No single native machine instruction can do that, because the native instructions change the state of the real machine, not the state of the virtual machine. A small change in state of the virtual machine involves lots of "non-atomic" changes in the state of the underlying real machine. The implementation of the virtual machine runs lots of native machine instructions to achieve that effect, but those instructions are determined when you implement the virtual machine, not dynamically at run time. Unless of course I'm all wet and what you're really doing is in fact dynamically re-writing the JVM implementation, which seems a bit mind boggling to me.
>The JVM itself has full access to the Operating System > that the JVM is running on. Whenever you have native methods > (eg. some of the GUI methods and the IO methods...), the JVM > will have to invoke the corresponding services from the Operating > System. Er, yes. A fixed set of instructions determined at implementation time, for each JVM machine instruction. Or is that not so?
> So, a JVM could invoke a JIT to translate frequently-executed code > into a suitable binary format that the OS can execute. Which means that the implementation of any given Java machine language primitive is dynamically altered at run time. Eek! Can that be true?
> I see no problem there. You don't? The native hardware instructions that find the head of a string are re-invented every time somebody does the "head" operation? Can that be right?
> (And as you noted, there are many powerful Java JIT out there) > > - JT
Lew - 01 May 2007 05:00 GMT > Why I don't understand how you can go "lower" than the VM's machine code:
> Are you saying that the VM is dynamically re-implemented at run time? Yes.
> dynamically at run time. Unless of course I'm all wet and what you're > really doing is in fact dynamically re-writing the JVM implementation > which seems a bit mind boggling to me. Yes, that's what's happening.
> Er, yes. A fixed set of instructions determined at implementation time, > for each JVM machine instruction. Or is that not so? That is not so.
> Which means that the implementation of any given Java machine language > primitive is dynamically altered at run time. Eek! Can that be true? Yes.
> You dont? The native hardware instructions that find the head of a > string are re-invented every time somebody does the "head" operation? > Can that be right? No. Just when necessary to optimize the program.
 Signature Lew
Lee - 01 May 2007 21:19 GMT >> Why I don't understand how you can go "lower" than the VM's machine code: > >> Are you saying that the VM is dynamically re-implemented at run time? > > Yes. Awesome.
I kept thinking of the virtual machine as a fixed "simulation" application, written once and set in stone; but I can see how it's possible to optimize whole blocks of code in ways that are not likely when considering just one primitive operation.
Wow! I'm still blown away by the concept.
Eric Sosman - 01 May 2007 06:08 GMT > [...] > [quoted text clipped - 3 lines] > you implemented the string virtual machine for that hardware in the > first place. For one thing, the virtual-to-native compilation can eliminate all the decoding of the virtual instructions. A straightforward interpreter will fetch a virtual instruction, fiddle with it for a while, and dispatch to an appropriate sequence of actual instructions that accomplish the virtual instruction's mission. It may amount to only a few masks, a few tests, and a big switch construct, but the interpreter goes through it on every virtual instruction. Once the code is compiled to native instructions, all the decoding and dispatching simply vanishes: it was done once, by the compiler, and need never be done again.
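To make the fetch/decode/dispatch cost concrete, here is a minimal sketch of a straightforward interpreter for Lee's hypothetical two-instruction string machine (Head = x01, Tail = x02). The class name `StringVmInterpreter`, the single string "register", and the `run` method are assumptions made up for this illustration; no real JVM is structured this simply.

```java
// Toy interpreter for the hypothetical string machine from this thread.
// Illustration only: the names and the one-register design are invented here.
final class StringVmInterpreter {
    static final byte HEAD = 0x01; // keep only the first character
    static final byte TAIL = 0x02; // drop the first character

    // Runs the program against one string "register" and returns its final value.
    static String run(byte[] program, String input) {
        String reg = input;
        for (int pc = 0; pc < program.length; pc++) {
            byte opcode = program[pc];   // fetch
            switch (opcode) {            // decode and dispatch, repeated on every instruction
                case HEAD:
                    reg = reg.substring(0, 1);
                    break;
                case TAIL:
                    reg = reg.substring(1);
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode " + opcode);
            }
        }
        return reg;
    }

    public static void main(String[] args) {
        byte[] secondChar = { TAIL, HEAD };
        System.out.println(run(secondChar, "hello")); // prints "e"
    }
}
```

The `for` loop and the `switch` are the overhead being described: they run again for every virtual instruction, every time the program is executed, even though the program itself never changes.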
Another effect is that the virtual instructions are quite often more general than they need to be for particular uses. Stepping away from your two-instruction string machine for a moment, let's suppose you've got a virtual instruction that adds two integers to form their sum. The interpreter probably fetches operand A, fetches operand B, adds them, and stores the sum in target C. Well, the virtual-to-native compiler might "notice" that A,B,C are the same variable, which the program adds to itself in order to double it. The generated native machine code is then quite unlikely to do two fetches: one will suffice, followed by a register-to-register add or a left shift or some such. Not only that, but the compiler may further notice that C is immediately incremented after doubling, so instead of storing C and fetching it back again for incrementation, the native machine code says "Hey, I've already got it in this here register" and eliminates both the store and the subsequent fetch.
>> [...] >> So, a JVM could invoke a JIT to translate frequently-executed code [quoted text clipped - 7 lines] > string are re-invented every time somebody does the "head" operation? > Can that be right? Could be. The virtual-to-native compiler has the advantage of being able to see the context in which a virtual instruction is used, and may be able to take shortcuts, as in the instruction-combining example above. As an example of a JVM-ish application of this sort of thing, consider compiling `x[i] += x[i];', our familiar doubling example but this time with arrays. Formally speaking, each array reference requires a range check -- but the JIT may notice that if the left-hand side passes the range check, there is no need to do it a second time on the right-hand side. Even better, the JIT may notice common patterns like
for (int i = 0; i < x.length; ++i) x[i] += x[i];
... and skip the range checking entirely.
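As a rough source-level picture of what that means, the sketch below spells out the range checks by hand and compares the result with the plain loop. The helper names `checkedGet` and `checkedSet` are hypothetical; a real JIT does this reasoning on its internal representation, not on Java source, and whether a given JVM removes the checks in a given loop is up to that JVM.

```java
import java.util.Arrays;

// Illustration only: explicit helpers stand in for the implicit range checks.
final class RangeCheckSketch {
    // What an array read formally involves: check the index, then load.
    static int checkedGet(int[] x, int i) {
        if (i < 0 || i >= x.length) throw new ArrayIndexOutOfBoundsException(i);
        return x[i];
    }

    // What an array write formally involves: check the index, then store.
    static void checkedSet(int[] x, int i, int value) {
        if (i < 0 || i >= x.length) throw new ArrayIndexOutOfBoundsException(i);
        x[i] = value;
    }

    // Naive translation: three checks per iteration.
    static void doubleAllNaive(int[] x) {
        for (int i = 0; i < x.length; ++i) {
            checkedSet(x, i, checkedGet(x, i) + checkedGet(x, i));
        }
    }

    // The loop header already guarantees 0 <= i < x.length, so every
    // check above is provably redundant for this pattern.
    static void doubleAll(int[] x) {
        for (int i = 0; i < x.length; ++i) x[i] += x[i];
    }

    public static void main(String[] args) {
        int[] a = { 1, 2, 3 };
        int[] b = a.clone();
        doubleAllNaive(a);
        doubleAll(b);
        System.out.println(Arrays.toString(a) + " " + Arrays.equals(a, b)); // prints "[2, 4, 6] true"
    }
}
```

Both methods compute the same result; the difference is only in how much checking work is done per element, which is the kind of saving being described.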
A viewpoint you may find helpful, if a little wrenching at first, is to think of the virtual instruction set as the elements of a low-level programming language. You could, with sufficient patience, write Java bytecode by hand, but it might be easier to write Java and use javac to generate bytecode from it. Either way, the bytecode is just an expression of a program, written in a formal language, and there's no reason a translator couldn't accept that formal language as its "source" for compilation.
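A minimal sketch of that viewpoint, using the hypothetical Head/Tail string machine from earlier in the thread: the byte program is treated as source text and translated once into a directly callable function. Everything here (`StringVmTranslator`, `translate`, the fusing of consecutive Tail instructions) is an assumption for illustration. A real JIT emits native machine code rather than Java closures, but the shape is similar: decoding happens once, and the translator can use context to combine instructions.

```java
import java.util.function.UnaryOperator;

// Illustration only: "compiles" the toy bytecode into a composed function.
final class StringVmTranslator {
    static final byte HEAD = 0x01; // keep only the first character
    static final byte TAIL = 0x02; // drop the first character

    // Decodes the program once; the returned function contains no opcode dispatch.
    static UnaryOperator<String> translate(byte[] program) {
        UnaryOperator<String> compiled = UnaryOperator.identity();
        int pc = 0;
        while (pc < program.length) {
            final UnaryOperator<String> soFar = compiled;
            byte opcode = program[pc];
            if (opcode == TAIL) {
                // Context-sensitive shortcut: fuse a run of TAILs into one substring call.
                int run = 0;
                while (pc < program.length && program[pc] == TAIL) {
                    run++;
                    pc++;
                }
                final int n = run;
                compiled = s -> soFar.apply(s).substring(n);
            } else if (opcode == HEAD) {
                compiled = s -> soFar.apply(s).substring(0, 1);
                pc++;
            } else {
                throw new IllegalArgumentException("unknown opcode " + opcode);
            }
        }
        return compiled;
    }

    public static void main(String[] args) {
        UnaryOperator<String> thirdChar = translate(new byte[] { TAIL, TAIL, HEAD });
        System.out.println(thirdChar.apply("hello")); // prints "l"
        System.out.println(thirdChar.apply("world")); // prints "r"
    }
}
```

The translation cost is paid once in `translate`; each later call to `thirdChar.apply` runs only the two fused steps, which is why translating pays off for code that is executed many times.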
 Signature Eric Sosman esosman@acm-dot-org.invalid
John W. Kennedy - 01 May 2007 17:37 GMT > Could be. The virtual-to-native compiler has the advantage > of being able to see the context in which a virtual instruction > is used, and may be able to take shortcuts, as in the instruction- > combining example above. It also knows /exactly/ what processor it's running on, and can take advantage of detailed timing information and new opcodes.
 Signature John W. Kennedy "But now is a new thing which is very old-- that the rich make themselves richer and not poorer, which is the true Gospel, for the poor's sake." -- Charles Williams. "Judgement at Chelmsford" * TagZilla 0.066 * http://tagzilla.mozdev.org
Kai Schwebke - 30 Apr 2007 18:12 GMT Lee schrieb:
> My problem is that in my conception, when it is a question of running a > virtual machine, the "compilation" would be to that vm's "machine" > language and thats as "low" as you could go. > > What have I got wrong? In the end the code does not run on the vm, but on the real machine. A runtime with "just in time compilation" compiles the virtual machine code to real, machine dependent code much like a compiler would do.
Kai
Christian - 30 Apr 2007 18:35 GMT Kai Schwebke schrieb:
> Lee schrieb: >> My problem is that in my conception, when it is a question of running a [quoted text clipped - 8 lines] > > Kai Is there any interpretation going on today in the jvm or is simply everything compiled to machinecode just in time before execution?
JT - 30 Apr 2007 18:39 GMT > Is there any interpretation going on today in the jvm or is simply > everything compiled to machinecode just in time before execution? Depends on the JIT. The default JIT from Sun ("HotSpot") will initially interpret the code. As the interpreter runs, HotSpot analyzes the runtime behavior and tries to identify which methods should be compiled to native code.
See this section on Sun.com: http://java.sun.com/products/hotspot/whitepaper.html#hotspot
- JT
Chris Smith - 30 Apr 2007 18:42 GMT > Is there any interpretation going on today in the jvm or is simply > everything compiled to machinecode just in time before execution? Modern JVMs do both; the performance-critical stuff is JIT'ed, but a lot of one-off initialization code will be interpreted. At one time, JIT compilers would frequently run before any code was executed; this was changed because mixed mode (some interpreting, some compiling) reduces the perceived start-up time of applications.
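One way to observe mixed mode on a HotSpot JVM is to run a small program with one obviously hot method, as in the sketch below. The class `HotLoop` and its loop counts are arbitrary choices for illustration. Running it with the HotSpot option `-Xint` forces interpreter-only execution, and `-XX:+PrintCompilation` logs methods as they are compiled; the timings you see will vary by machine and JVM version, so none are claimed here.

```java
// Illustration only: a hot method that a mixed-mode JVM is likely to compile.
final class HotLoop {
    // Small, frequently called method: a typical candidate for JIT compilation.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long total = 0;
        for (int round = 0; round < 20_000; round++) {
            total += sumTo(1_000); // each call returns 500500
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // total is always 10010000000; elapsedMs depends on the machine and JVM flags.
        System.out.println("total = " + total + ", elapsed ms = " + elapsedMs);
    }
}
```

Comparing `java HotLoop` with `java -Xint HotLoop` gives a rough feel for how much of the speed comes from compilation, while the one-off start-up code around the loop is exactly the kind of code that is cheaper to leave interpreted.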
 Signature Chris Smith
Chris Smith - 30 Apr 2007 18:40 GMT > Perhaps someone can "debug" me on this: > [quoted text clipped - 3 lines] > byte code itself would "drive" the VM, taking the VM from internal state > to internal state until the computation was done. Bytecode is just a file format to represent the actions of a piece of Java code. Nothing more, and nothing less. It can be used directly, or it can be converted to a different format.
Early implementations of Java interpreted it; that is, they used it directly. Actually, implementations of Java on cellular phones and other embedded devices often still do this because it's more efficient in terms of memory usage and generally no one does high-performance computation on a cell phone.
Newer Java implementations (as of 1999 or so) for desktop and server platforms rarely interpret the bytecode. They translate it into the native machine language, and let the processor run that native machine language directly. This is, obviously, much faster.
> But if the bytecode were just a portable abstraction, something "above" > the JVM's machine language but "below" the java source language, that > would create the need to compile the byte code "the rest of the way > down" to the actual jvm machine language, but NOT to native hardware > machine language. There is no JVM machine language. Perhaps what's confusing you is that there is no such thing as "the" JVM. There is a JVM for x86, another for 64-bit x86 platforms, another for Sparc, and so on... There are different JVMs for different operating systems as well, though they often share most of the JIT implementation. Each of these implementations of a JVM contains its own different JIT compiler that generates code appropriate for that processor.
So in the end, the JVM for a particular platform does the transformation to the native machine language for that CPU, and from that point on it just runs the code and sits back and waits for the code to call it; after the JIT step, the JVM is essentially just a library that is called by the application code.
 Signature Chris Smith
Wojtek - 30 Apr 2007 19:12 GMT Lee wrote :
>> All other things being equal, we expect an interpreted language to run a >> bit slower than native machine code. [quoted text clipped - 38 lines] > virtual machine, the "compilation" would be to that vm's "machine" language > and thats as "low" as you could go. In a typical native environment you have:
source - what the programmer wants done
object code - what the programmer wants done, but in a form the computer can understand
library - how to do stuff for a particular operating system
executable - what the programmer wants done along with how to do it for that OS
So the sequence is:
source -> object code (compiler, with optimization switches which the programmer "guesses" will make the code run faster/better)
object code + library -> executable (linker)
** the executable is distributed
the user runs the executable
In Java you have:
source - what the programmer wants done
bytecode - what the programmer wants done, but in a form the Java Virtual Machine can understand
** the bytecode is distributed
On the client machine, the JVM reads the byte code, and since the JVM is native to that OS it knows how to do stuff.
The sequence is:
source -> byte code (compiler)
** the byte code is distributed
the user runs the JVM pointing to the byte code (the JVM does on-the-fly optimizations depending on how THAT user uses the application)
byte code + JVM
In a purely interpreted environment (such as Perl and PHP):
source - what the programmer wants done
** the source is distributed
the user runs the Perl (or PHP) interpreter pointing to the source code
This makes more sense with pretty diagrams :)
Note: Yes I know you can now get Perl and PHP linkers to produce executables.
 Signature Wojtek :-)
RedGrittyBrick - 30 Apr 2007 21:32 GMT > In a purely interpreted environment (such as Perl and PHP) These days, things are rarely that simple. http://www.perl.com/doc/FMTEYEWTK/comp-vs-interp.html
Joshua Cranmer - 01 May 2007 22:35 GMT > Whats the current state of the art? Would we expect a java program to > run at 0.5 * the speed of C, or 0.7 or 0.9 or what? In one program contest I participate in, the Java factor is 1.5x, BUT this is considering that all code is expected to run in 1 second or less and that great emphasis is placed on optimized code.
I would expect that most applications would run at approximately native speeds.
As a side-note, said competition used to use a 5x factor (but it used 1.3)....