I have a newbie sort of question about Java byte code. Will the same
source code always produce the same byte code? How about code that
differs only in comments?
derek - 08 Jan 2008 19:18 GMT
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
Simple enough to test: just compile a program more than once and see whether the class files are the same.
You could also try different versions of the compiler, or compilers from different vendors.
=====================================================
THIS IS MY SIGNATURE. There are many like it, but this one is mine.
External Concepts Guild - 08 Jan 2008 19:23 GMT
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
The same source code will not always produce the same byte
code. Two source files that differ only in comments
may generate different byte code. It is easy to see with
examples.
Let the file test.java be
    class test
    {
        public static void main(String[] args)
        {
            // Silly Test
            System.out.println("Hi!");
        }
    }
then javac -g test.java (generate all debugging info) will
generate different byte code than javac -g:none test.java
(generate no debugging info).
Similarly if you delete the comment, you can see that the
resulting byte code is changed.
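A minimal sketch of that experiment, assuming it runs on a JDK (ToolProvider.getSystemJavaCompiler() returns null when no compiler is available). The class name DebugInfoDemo and the helper compile() are made up for illustration; the source strings mirror the test.java above. It compiles the file with -g and with -g:none, with and without the comment, and compares the resulting class files byte for byte.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class DebugInfoDemo {
    // The test.java from the post, and the same file with the comment line deleted.
    static final String WITH_COMMENT =
        "class test {\n"
        + "  public static void main(String[] args) {\n"
        + "    // Silly Test\n"
        + "    System.out.println(\"Hi!\");\n"
        + "  }\n"
        + "}\n";
    static final String WITHOUT_COMMENT =
        WITH_COMMENT.replace("    // Silly Test\n", "");

    // Writes the source to test.java in a fresh temp directory, compiles it
    // with the given debug option, and returns the bytes of test.class.
    static byte[] compile(String source, String debugOption) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; run this on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("bytecode-demo");
            Path file = dir.resolve("test.java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));
            int status = javac.run(null, null, null,
                    debugOption, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("javac exited with status " + status);
            }
            return Files.readAllBytes(dir.resolve("test.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] full = compile(WITH_COMMENT, "-g");
        byte[] none = compile(WITH_COMMENT, "-g:none");
        byte[] fullNoComment = compile(WITHOUT_COMMENT, "-g");
        byte[] noneNoComment = compile(WITHOUT_COMMENT, "-g:none");

        System.out.println("-g vs -g:none identical:            "
                + Arrays.equals(full, none));
        System.out.println("comment removed, -g identical:      "
                + Arrays.equals(full, fullNoComment));
        System.out.println("comment removed, -g:none identical: "
                + Arrays.equals(none, noneNoComment));
    }
}
```

The comment only matters here because deleting it moves the println call to a different line, and the line number table records that. With -g:none there is no line number table to change.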
Daniel Pitts - 08 Jan 2008 20:50 GMT
On Jan 8, 10:42 am, twopoint718 <christopher.j.wil...@gmail.com>
wrote:
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
The class files themselves may be different, but I don't think the
byte-code will change. The reason the class files could change has
to do with debug and exception line-number information.
It is also possible that different versions of the Java compiler will
produce slightly different byte-code for the same source, however the
same compiler should produce the same output for the same input.
Mike Schilling - 09 Jan 2008 04:32 GMT
> It is also possible that different versions of the Java compiler
> will
> produce slightly different byte-code for the same source, however
> the
> same compiler should produce the same output for the same input.
Probably, but there are no guarantees. "Here's a really sparse switch
statement. I could produce either a tableswitch or a sequence of
if_icmpeq's. I'll randomly select one or the other." probably isn't
the world's best compiler design, but it's perfectly legal.
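To look at what a particular compiler actually chooses, a small sketch like the following can be compiled and then disassembled with javap -c. The class name SwitchDemo and its methods are invented for illustration. Which instruction appears for each method (tableswitch, lookupswitch, or something else) is the compiler's decision and is not promised by the language, which is the point being made above.

```java
class SwitchDemo {
    // Dense case labels: a jump table (tableswitch) is the natural encoding.
    static String dense(int n) {
        switch (n) {
            case 1: return "one";
            case 2: return "two";
            case 3: return "three";
            case 4: return "four";
            default: return "other";
        }
    }

    // Sparse case labels: a compiler may prefer a key lookup (lookupswitch)
    // or, in principle, a chain of comparisons. The choice is its own.
    static String sparse(int n) {
        switch (n) {
            case 1: return "one";
            case 1000: return "thousand";
            case 1000000: return "million";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(dense(3) + " " + sparse(1000));
    }
}
```

Whatever encoding is chosen, the observable behaviour of both methods has to be the same; only the bytes in the class file differ.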
Lew - 09 Jan 2008 05:32 GMT
>> It is also possible that different versions of the Java compiler
>> will
[quoted text clipped - 6 lines]
> if_icmpeq's. I'll randomly select one or the other." probably isn't
> the world's best compiler design, but it's perfectly legal.
Funny, that sounds just like the design of the other Java compiler - the
bytecode-to-machine-code (JIT) compiler that routinely makes such decisions at
runtime, except that the choice isn't actually random.
Seems like a pretty good design - it makes for all sorts of lovely
optimizations, and perhaps more importantly, de-optimizations as the situation
demands, and has significantly improved Java's run-time performance since its
institution.
Up until now we've only talked about compilation to bytecode, which is
relatively static, and that does raise the rather interesting question. Could
the Java compiler's choice of bytecode, say for a tableswitch or not as above,
preclude the JVM's HotSpot compiler from making some wiser choices?
Or perhaps there is some flexibility even there, where the same bytecode (say
the if_icmpeq series) might turn into the local CPU equivalent of a
tableswitch due to the JVM's alertness, or not, dynamically, at different
times in the same program run.
How sophisticated are HotSpot's optimizations these days? I'm having
some trouble getting a read on how clever hotspotting is in real life
compared to what the white papers say it should be, but the evidence is that
something is doing something.
I modified some old Linpack benchmark code off the net the other day, to make
it run multiple times in a loop, reporting its results for each loop of, say,
100-by-100 matrix calculations. Over ten iterations the reported speed
improved step by step from "20" to "686", using whatever the program thinks it
measures.
Setting -client instead of -server seems to dampen that a little. I'm
guessing that the acceleration is the result of hotspotting, but I haven't
ruled out cache or other effects yet. It's possible the optimizer lifted out
entire loop bodies due to lack of side effects, thus both validating and
invalidating the benchmark results.
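A rough sketch of that kind of loop, not the Linpack code itself: WarmupDemo, multiplySum and timeRounds are made-up names, and the workload is a plain 100-by-100 matrix multiply. Each round's result is folded into a checksum that is used afterwards, which is one way to make it harder for an optimizer to drop the loop body as side-effect free. The timings it prints depend entirely on the machine and JVM, so none are shown here.

```java
class WarmupDemo {
    // Multiplies two n-by-n matrices and returns the sum of the product's
    // entries, so the work has an observable result.
    static double multiplySum(int n) {
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        }
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double cell = 0.0;
                for (int k = 0; k < n; k++) {
                    cell += a[i][k] * b[k][j];
                }
                sum += cell;
            }
        }
        return sum;
    }

    // Times `rounds` identical runs and returns the elapsed nanoseconds of each.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        double checksum = 0.0;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            checksum += multiplySum(n);
            nanos[r] = System.nanoTime() - start;
        }
        // Using the checksum keeps the computed values live.
        if (Double.isNaN(checksum)) {
            throw new IllegalStateException("unexpected NaN checksum");
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 100);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] + " ns");
        }
    }
}
```

Running it once with -client and once with -server (where the JVM supports both) gives a crude comparison, though cache effects are still mixed into the numbers.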

Signature
Lew
Mark Space - 08 Jan 2008 20:53 GMT
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
All other things being equal, the same source should always produce the
same byte codes. However as pointed out above, compiler options will
change this, and some compiler options retain comments in the bytecodes.
(I think that's what he's saying.)
Also, a newer version of a compiler might produce different (differently
optimized) code, and therefore different bytecodes are emitted.
Normally this is a good thing.
However, if for example, you feel you must keep old bytecodes around to
debug a problem exactly as it appears in a customer's environment, or
for legal reasons perhaps, you'll have to save each binary you release.
Some source code control systems can do this, but don't forget about
off-site back-ups too.
Mark Thornton - 08 Jan 2008 21:14 GMT
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
The names used for synthetic methods depend on the version (and origin)
of the compiler. This affects the serialVersionUID computed for the class.
Mark Thornton
Lew - 09 Jan 2008 04:27 GMT
>> I have a newbie sort of question about Java byte code. Will the same
>> source code always produce the same byte code? How about code that
>> differs only in comments?
>
> The names used for synthetic methods depend on the version (and origin)
> of the compiler. This affects the serialVersionUID computed for the class.
One reason why Joshua Bloch and others recommend not leaving it up to the
compiler to determine the serialVersionUID.
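A short sketch of the difference, with made-up class names (Pinned, Computed, SerialUidDemo). ObjectStreamClass.lookup reports the serialVersionUID the serialization runtime will use: the declared constant when there is one, otherwise a hash computed from the class's structure, which is where compiler-dependent details can leak in.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Declares its own version ID, so the value does not depend on the compiler.
class Pinned implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

// Declares nothing, so the runtime computes an ID from the class's members.
class Computed implements Serializable {
    int value;
}

class SerialUidDemo {
    // Returns the serialVersionUID that serialization will use for the type.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:   " + uidOf(Pinned.class));
        System.out.println("Computed: " + uidOf(Computed.class));
    }
}
```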

Signature
Lew
Joshua Cranmer - 08 Jan 2008 22:20 GMT
> I have a newbie sort of question about Java byte code. Will the same
> source code always produce the same byte code? How about code that
> differs only in comments?
There are three things that influence the bytecode:
1. The source
2. The compiler
3. The options used to compile
If any of those change, the bytecode may change. Although I am not an
expert, I will attempt to explain when changes to one may change the
bytecode:
1. A change in comments may or may not produce different bytecode. I can
think of two cases that would cause the bytecode to change:
a. A '@deprecated' is added to a documentation comment.
b. The comment changes the line count and line numbers are pushed to
the bytecode (which is the default).
Other circumstances:
c. Other generic line count changes.
d. A change in the token stream, excluding variable name changes if
and only if -g is not used.
2. The compiler changes depend on the two compilers involved. A short
summary:
a. A change in the lowest release number should not change the
bytecode, unless a compiler bug is involved.
b. Most changes between two major-number releases should not matter:
1) 1.4 -> 1.5 will change the code if a class literal is used or
try-catch-finally is used.
2) Synthetic variable names may change between releases.
c. All major releases will change the major version number and
potentially the minor version number in the class file (bytes 7-8 and
5-6, respectively). The actual difference between these class file
versions is, for the most part, nonexistent.
3. -g will change the amount of information reified to the code, and
-target will change version numbers. Those are the only two I can come
up with off the top of my head.
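The version bytes mentioned in 2c can be read directly. A minimal sketch, with the made-up name ClassHeaderDemo: a class file starts with a 4-byte magic number, then a 2-byte minor version, then a 2-byte major version, all big-endian. The demo parses a hand-written 8-byte header rather than a real file; major version 50 corresponds to Java 6.

```java
import java.nio.ByteBuffer;

class ClassHeaderDemo {
    // Returns {magic, minor_version, major_version} from the first 8 bytes
    // of a class file. ByteBuffer reads big-endian by default.
    static long[] readHeader(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        long magic = in.getInt() & 0xFFFFFFFFL; // bytes 1-4
        int minor = in.getShort() & 0xFFFF;     // bytes 5-6
        int major = in.getShort() & 0xFFFF;     // bytes 7-8
        return new long[] { magic, minor, major };
    }

    public static void main(String[] args) {
        // Hand-written header: magic 0xCAFEBABE, minor 0, major 50.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50
        };
        long[] h = readHeader(header);
        // prints "magic=CAFEBABE minor=0 major=50"
        System.out.printf("magic=%X minor=%d major=%d%n", h[0], h[1], h[2]);
    }
}
```

Passing the bytes of a real class file (for example from Files.readAllBytes) compiled with different -target values should show the major version changing while the rest of the file stays largely the same.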

Signature
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
Andrew Thompson - 09 Jan 2008 05:59 GMT
...
>There are three things that influence the bytecode:
...
>2. The compiler changes depend on the two compilers involved. A short
>summary:
> a. A change in the lowest release number should not change the
>bytecode, unless a compiler bug is involved.
Yes, it might. Code compatible with 1.3, if later compiled using the
javac from 1.4, might end up with 1.4 method calls in the binary.
Using -bootclasspath at compile time (pointing to an rt.jar of
suitable vintage) is the only way to guarantee that will not happen.

Signature
Andrew Thompson
http://www.physci.org/
Lew - 09 Jan 2008 06:07 GMT
Joshua Cranmer wrote:
> ...
>> There are three things that influence the bytecode:
[quoted text clipped - 3 lines]
>> a. A change in the lowest release number should not change the
>> bytecode, unless a compiler bug is involved.
> Yes, it might. Code compatible with 1.3, if later compiled using the
> javac from 1.4, might end up with 1.4 method calls in the binary.
> Using -bootclasspath at compile time (pointing to an rt.jar of
> suitable vintage) is the only way to guarantee that will not happen.
Strictly speaking, that's a library incompatibility, not a bytecode
incompatibility. The bytecode to call a non-existent method is pretty much
the same as the bytecode to call one that does exist. It's the results that
differ.
Of course, it's also the results that count.

Signature
Lew
Is that another nit? Magnifying glass! Tweezers! No, the small ones!
Joshua Cranmer - 09 Jan 2008 22:02 GMT
> ..
>> There are three things that influence the bytecode:
[quoted text clipped - 8 lines]
> Using -bootclasspath at compile time (pointing to an rt.jar of
> suitable vintage) is the only way to guarantee that will not happen.
I was talking about the difference between (say) 1.6.0_01 and 1.6.0_02,
as well as 1.6.0 -> 1.6.1 (internal version numbers here).

Signature
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
Roedy Green - 09 Jan 2008 04:29 GMT
On Tue, 8 Jan 2008 10:42:42 -0800 (PST), twopoint718
<christopher.j.wilson@gmail.com> wrote, quoted or indirectly quoted
someone who said :
>I have a newbie sort of question about Java byte code. Will the same
>source code always produce the same byte code? How about code that
>differs only in comments?
Yes, if it is compiled with the same compiler. Comments don't matter.
Neither do excess import statements.
Try the experiment. The only exception I could imagine would be some
sort of compile date embedded in the class file. I don't know if
there is one. You can find out by compiling twice and comparing.
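One way to run that experiment from Java itself, as a sketch that assumes a JDK is available (ToolProvider.getSystemJavaCompiler() returns null otherwise). CompileTwiceDemo, compileOnce and sha256 are made-up names. It compiles the same source twice into separate temp directories and compares SHA-256 digests of the two class files; whether they match is exactly what the experiment is meant to find out, so no result is claimed here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileTwiceDemo {
    static final String SOURCE =
        "class Hello {\n"
        + "  public static void main(String[] args) {\n"
        + "    System.out.println(\"Hi!\");\n"
        + "  }\n"
        + "}\n";

    // Compiles the source in a fresh temp directory with default options
    // and returns the bytes of the resulting class file.
    static byte[] compileOnce(String className, String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; run this on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("compile-twice");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));
            int status = javac.run(null, null, null,
                    "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("javac exited with status " + status);
            }
            return Files.readAllBytes(dir.resolve(className + ".class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hex-encoded SHA-256 digest, handy for comparing builds at a glance.
    static String sha256(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String first = sha256(compileOnce("Hello", SOURCE));
        String second = sha256(compileOnce("Hello", SOURCE));
        System.out.println(first);
        System.out.println(second);
        System.out.println("identical: " + first.equals(second));
    }
}
```

If the compiler embedded a compile date, the two digests would differ even though nothing in the source changed.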

Signature
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com