Java Forum / General / May 2005
JVM optimization
ilkinulas - 09 May 2005 09:12 GMT Hi, two functions test1 and test2 do the same thing, but test2 performs nearly 20 times better than test1. The JVM is unable to optimize the code in test1. Is there a way to tell the Java virtual machine to do this kind of optimization at compile time or runtime? NOTE: we are using Log4J and java version "1.4.2_02"
private static final Logger log = Logger.getLogger(DebugTest.class);
--------------------------------------------------------------------------
public void test1() {
    for (int i = 0; i < 10000000; i++) {
        String s = "test" + i;
        if (log.isDebugEnabled()) {
            log.debug(s);
        }
    }
}

public void test2() {
    for (int i = 0; i < 10000000; i++) {
        if (log.isDebugEnabled()) {
            String s = "test" + i;
            log.debug(s);
        }
    }
}
--------------------------------------------------------------------------
Thomas Schodt - 09 May 2005 10:17 GMT > two functions test1 and test2 do almost
> the same thing but test2 performs > nearly 20 times better than test1. JVM is unable to optimize the code > in test1. is there a way to tell the java virtual machine to do this > kind of optimization at compile time or runtime? You can tell the programmer to read the docs <http://logging.apache.org/log4j/docs/manual.html#performance>
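A minimal sketch of why the placement of the guard matters, assuming java.util.logging as a stand-in for Log4J (isLoggable(Level.FINE) plays the role of log.isDebugEnabled(), and the class and method names are made up for illustration). It only counts how many message strings get built while debug-level output is switched off; it is not a benchmark.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedLoggingSketch {
    // Stand-in for the Log4J logger in the original post (assumption).
    private static final Logger LOG = Logger.getLogger("GuardedLoggingSketch");

    static {
        LOG.setLevel(Level.INFO); // FINE (debug-like) messages are disabled
    }

    /** Shape of test1: the message is built before the guard is checked. */
    static int unguarded(int iterations) {
        int built = 0;
        for (int i = 0; i < iterations; i++) {
            String s = "test" + i; // paid for on every iteration
            built++;
            if (LOG.isLoggable(Level.FINE)) {
                LOG.fine(s);
            }
        }
        return built;
    }

    /** Shape of test2: the message is built only if it will be logged. */
    static int guarded(int iterations) {
        int built = 0;
        for (int i = 0; i < iterations; i++) {
            if (LOG.isLoggable(Level.FINE)) {
                String s = "test" + i;
                built++;
                LOG.fine(s);
            }
        }
        return built;
    }

    public static void main(String[] args) {
        System.out.println("unguarded built: " + unguarded(1000));
        System.out.println("guarded built:   " + guarded(1000));
    }
}
```

With the level set as above, the guarded loop never constructs a string, which is the saving the original poster measured.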
bugbear - 09 May 2005 10:22 GMT > Hi, > two functions test1 and test2 does the same thing but test2 performs > nearly 20 times better than test1. What - even if if(log.isDebugEnabled()) returns true?
> JVM is unable to optimize the code > in test1. is there a way to tell the java virtual machine to do this [quoted text clipped - 22 lines] > } > -------------------------------------------------------------------------- The compiler has no way of knowing how "likely", let alone constant, the result of the method log.isDebugEnabled() is.
The reason *you* can optimise this is because you have knowledge the compiler does not.
BugBear
Chris Uppal - 09 May 2005 11:49 GMT > The compiler has no way of knowing how "likely", let alone > constant, the result of method log.isDebugEnabled(). > > The reason *you* can optimise this is because you have > knowledge the compiler does not. This isn't quite entirely correct, although it's probably close enough for the OP's purposes.
The runtime JITer /could/, in theory, analyse log.isDebugEnabled() and determine that it would always return false (E.g. if it inlined it to an access of a static boolean field that was declared final, or which it could "see" was never written to). In that case, and if it could further determine that "s" was not used elsewhere, and that the StringBuilder manipulations involved in "test"+i had no side-effects, then it would be justified in removing that code.
The "server" JVM from Sun is certainly capable of performing that /kind/ of optimisation. I don't know whether it would actually do so in this particular case.
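For contrast, here is a sketch of the easy case described above, where the guard is a static final boolean. The flag name DEBUG and the class are hypothetical. Because the flag is a compile-time constant, nothing has to be proved at run time, and a compiler is free to drop the guarded block entirely; log.isDebugEnabled() offers no such guarantee because it reads mutable logger configuration.

```java
final class ConstantFlagSketch {
    // Hypothetical compile-time constant standing in for the debug check.
    private static final boolean DEBUG = false;

    /** Returns how many message strings were built; none when DEBUG is false. */
    static int run(int iterations) {
        int built = 0;
        for (int i = 0; i < iterations; i++) {
            if (DEBUG) { // constant condition: the body is dead code
                String s = "test" + i;
                built++;
                System.out.println(s);
            }
        }
        return built;
    }

    public static void main(String[] args) {
        System.out.println("built: " + run(1000));
    }
}
```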
Note, BTW, that if this code was compiled with a version of javac before 1.5, or compiled for a pre-1.5 platform, then "test"+i would be compiled into StringBuffer manipulation, rather than StringBuilder. In that case it is /not/ true that "test"+i has no side-effects (since it involves crossing a synchronisation barrier), so I wouldn't expect the JITer to be able to remove it (unless it was buggy, or "knew" that synchronisation barriers didn't matter for the particular code it generated for the particular machine it was running on).
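A rough sketch of the two expansions being contrasted, written out by hand. The exact bytecode javac emitted may differ in detail, and the class and method names are illustrative only. The point is that both produce the same string, but every StringBuffer call acquires and releases the buffer's monitor, while StringBuilder involves no locking.

```java
final class ConcatExpansionSketch {
    /** Approximately what "test" + i became before javac 1.5: synchronized appends. */
    static String viaStringBuffer(int i) {
        return new StringBuffer().append("test").append(i).toString();
    }

    /** Approximately what "test" + i became with javac 1.5: no monitors involved. */
    static String viaStringBuilder(int i) {
        return new StringBuilder().append("test").append(i).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaStringBuffer(7));
        System.out.println(viaStringBuilder(7));
    }
}
```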
To the OP: in general there is no way of telling the runtime that you want it to perform any particular optimisation -- other than doing it yourself by changing the code. Any particular JVM implementation /may/ have options to do so, but I don't know of any, and in any case it would be extremely obscure, and probably unsupported.
-- chris
Lee Fesperman - 09 May 2005 22:54 GMT > Note, BTW, that if this code was compiled with a version of javac before 1.5, > or compiled for a pre-1.5 platform, then "test"+i would be compiled into [quoted text clipped - 4 lines] > for the particular code it generated for the particular machine it was running > on). A smart JIT could know that the StringBuffer object was local and its reference wasn't passed to an external method. I would guess JIT would do 'variable' usage analysis of this type. The machine architecture wouldn't matter. I know of one JIT that discovers if a reference's lifetime is local and allocates it on the stack.
 Signature Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com) ============================================================== * The Ultimate DBMS is here! * FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
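The distinction drawn above, between a buffer that stays local to a method and one whose reference is handed elsewhere, can be sketched as follows. The names are hypothetical, and whether a JIT may actually drop the locking in the local case is exactly what the rest of the thread argues about.

```java
final class EscapeSketch {
    // Hypothetical field, present only to demonstrate an escaping reference.
    private static StringBuffer leaked;

    /** The buffer never leaves this method; only the resulting String does. */
    static String local(int i) {
        StringBuffer sb = new StringBuffer();
        sb.append("test").append(i);
        return sb.toString();
    }

    /** Storing the reference in a static field makes the buffer reachable by other threads. */
    static String escaping(int i) {
        StringBuffer sb = new StringBuffer();
        leaked = sb; // the reference escapes here
        sb.append("test").append(i);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(local(3));
        System.out.println(escaping(3));
    }
}
```

In the first method no other thread can ever contend for the buffer's monitor; in the second one it could, so an analysis would have to treat the two differently.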
Chris Uppal - 10 May 2005 11:21 GMT > > Note, BTW, that if this code was compiled with a version of javac > > before 1.5, [quoted text clipped - 12 lines] > A smart JIT could know that the StringBuffer object was local and its > reference wasn't passed to an external method. Doesn't make any difference -- entering or leaving a synchronised block has a /global/ effect, and therefore cannot be optimised away.
(Unless, as I said, the JITer knows that the "global effect" is in fact zero for that particular machine architecture and code-generation strategy -- which may be the case, but which is not true in general.)
-- chris
Lee Fesperman - 10 May 2005 21:06 GMT > > > Note, BTW, that if this code was compiled with a version of javac > > > before 1.5, or compiled for a pre-1.5 platform, then "test"+i would [quoted text clipped - 11 lines] > Doesn't make any difference -- entering or leaving a synchronised block has a > /global/ effect, and therefore cannot be optimised away. You lost me here. How does synchronizing on a truly 'local' object have a global effect? No other thread could possibly synchronize on the object. As I mentioned, the object could even be on the stack.
Mark Thornton - 10 May 2005 21:31 GMT >>>>Note, BTW, that if this code was compiled with a version of javac >>>>before 1.5, or compiled for a pre-1.5 platform, then "test"+i would [quoted text clipped - 15 lines] > No other thread could possibly synchronize on the object. As I mentioned, the object > could even be on the stack. Synchronizing causes any values 'cached' in thread local memory to be written to main memory. It also invalidates any values previously read from main memory --- they have to be reread in case they have changed. This effect is not limited to the object used for synchronization.
Mark Thornton
Lee Fesperman - 10 May 2005 22:05 GMT > >>>>Note, BTW, that if this code was compiled with a version of javac > >>>>before 1.5, or compiled for a pre-1.5 platform, then "test"+i would [quoted text clipped - 20 lines] > from main memory --- they have to be reread in case they have changed. > This effect is not limited to the object used for synchronization. Sure, but how does optimizing that away invalidate the correctness of the execution here? The memory barrier would only be important for values in the object being synchronized, which no other thread could see or affect.
In the case we're discussing, the 1.5 compiler actually does optimize it away ;^)
Mark Thornton - 10 May 2005 22:46 GMT > Sure, but how does optimizing that away invalidate the correctness of the execution > here?
> The memory barrier would only be important for values in the object being > synchronized, which no other thread could see or affect. This is not true. Memory barriers affect all values, otherwise the common technique of synchronizing on a 'lock' object (which is often just an instance of Object) wouldn't be valid.
> In the case we're discussing, the 1.5 compiler actually does optimize it away ;^)
The spec doesn't require the use of StringBuffer to implement string concatenation, so it is legitimate for it to be replaced by StringBuilder. So any code relying on the use of StringBuffer (and its synchronization is the only effect that might be visible) would be invalid. However, if the use of StringBuffer had remained, it would be extremely difficult for a JIT to correctly remove it, as it would have to prove that the memory barrier was not required.
Mark Thornton
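The lock-object technique mentioned above can be sketched like this (class, field and method names are made up). The lock has no fields of its own, yet synchronizing on it is what makes the writer's updates to the plain, non-volatile fields visible to the reader.

```java
final class LockObjectSketch {
    private final Object lock = new Object(); // carries no data itself
    private int payload;                      // plain fields, deliberately not volatile
    private boolean ready;

    /** Writer side: both fields are updated while holding the lock. */
    void publish(int value) {
        synchronized (lock) {
            payload = value;
            ready = true;
        }
    }

    /** Reader side: polls under the same lock until the writer has published. */
    int await() {
        while (true) {
            synchronized (lock) {
                if (ready) {
                    return payload;
                }
            }
            Thread.yield();
        }
    }

    /** Runs one writer thread and reads the value back on the calling thread. */
    static int demo() {
        LockObjectSketch box = new LockObjectSketch();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        return box.await();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the barrier only covered the fields of the object being locked, this idiom would guarantee nothing about payload or ready, which is the point being made in the post.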
Lee Fesperman - 10 May 2005 23:35 GMT > > Sure, but how does optimizing that away invalidate the correctness of the execution > > here? [quoted text clipped - 5 lines] > common technique of synchronizing on a 'lock' object (which is often > just an instance of Object) wouldn't be valid. Ok, I wasn't considering that side effect.
> > In the case we're discussing, the 1.5 compiler actually does > > optimize it away ;^) [quoted text clipped - 6 lines] > for a JIT to correctly remove it as it would have to prove that the > memory barrier was not required. Right, because of the nature of bytecode, the JIT couldn't know if this was an explicit or implicit use of concatenation, thus it couldn't tell if it were being used specifically because of this side effect. It would have to prove there was nothing affected by the memory barrier.
Lee Fesperman - 11 May 2005 06:49 GMT > > > Sure, but how does optimizing that away invalidate the correctness > > > of the execution here? [quoted text clipped - 7 lines] > > Ok, I wasn't considering that side effect. Oops, I'm going to have to take that back! I shoulda trusted my original intuition ;^)
Yes, the normal memory barrier does do this. However, it is inappropriate to depend on those effects outside of the synchronized section. The only variables referenced in the synchronized section we were discussing (those in the 'local' copy of StringBuffer) can never be accessed externally. It would be incorrect to assume the synchronization affects any other external variables. Besides, depending on this effect for variables *outside* the synchronization would be non-deterministic ... a 'was the variable changed before the synchronization or after?' kind of situation.
I can envision a JIT that would determine all external variables accessed while synchronized on a specific reference and would only apply the memory barrier to those. IMO, that type of optimization should be allowed.
John C. Bollinger - 11 May 2005 15:04 GMT >>>>The memory barrier would only be important for values in the object being >>>>synchronized, which no other thread could see or affect. [quoted text clipped - 11 lines] > synchronized section we were discussing (those in the 'local' copy of StringBuffer) can > never be accessed externally. The only question of variable scope that is relevant is whether a variable is local (to a method) or not (in which case it is "shared"). Whether or not a shared variable is referenced in the synchronized section does not affect the fact that the thread needs to reload it from main memory the next time its value is used after the barrier, whether that happens to occur inside the synchronized section or not. This can result in the thread "seeing" a different value for *any* shared variable after the barrier than it did before the barrier.
> It would be incorrect to assume the synchronization > affects any other external variables. Yes, but more to the point, it is also incorrect to assume that the synchronization *does not* affect any particular shared variable. That's why the compiler cannot remove it, and why it's non-trivial for the JIT to remove it.
> Besides, depending this effect on variables > *outside* the synchronization would be non-deterministic ... a 'was the variable changed > before the synchronization or after?' kind of situation. Multi-threaded programming *is* nondeterministic. The point of synchronization is to constrain the possible global sequence of events, but you cannot make it totally deterministic and retain "simultaneous" execution. Chapter 17 of the JLS (second edition) is entirely devoted to this topic.
> I can envision a JIT that would determine all external variables accessed while > synchronized on a specific reference and would only apply the memory barrier to those. > IMO, that type of optimization should be allowed. That would not be sufficient to comply with the language's requirements. The JIT conceivably could perform an analysis proving that it could remove the barrier, but it would be considerably more complicated than you suggest, as it would involve visibility and use analysis for all variables visible to the thread at the barrier, relative to *all other live threads*.
Lee, you seem to have some ideas about synchronization that are inconsistent with the Java specs (or perhaps I'm totally wrong, but either way ...). I suggest you read over JLS(2e).17 to see whether you think it supports your position.
 Signature John Bollinger jobollin@indiana.edu
Chris Uppal - 11 May 2005 16:23 GMT > The JIT conceivably could perform an analysis proving that it could > remove the barrier, but it would be considerably more complicated than > you suggest, as it would involve visibility and use analysis for all > variables visible to the thread at the barrier, relative to *all other > live threads*. Or, as I suggested above but didn't expand on, the JIT might know that the synchronisation barriers had no effect.
If it was running on a machine/architecture/mode where the hardware provided full synchronisation between the memory seen by different processors (e.g. a single processor box ;-) then there'd be no need to worry about hardware synchronisation. If it also knew that its own code-generation strategy didn't involve caching "global" data in thread-local store (such as the threads' stacks) then it would not need to worry about flushing that data into main store. I suppose it would still have to ensure that registers were flushed back to the stack/main-memory, but that can be achieved with a purely local analysis.
-- chris
Lee Fesperman - 11 May 2005 19:04 GMT > >>>>The memory barrier would only be important for values in the object being > >>>>synchronized, which no other thread could see or affect. [quoted text clipped - 16 lines] > result in the thread "seeing" a different value for *any* shared > variable after the barrier than it did before the barrier. Yes, only "shared" variables need be considered. I was making a distinction between shared variables that are accessed within the synchronized section and those that are not. To clarify, "synchronized section" refers to *any* code synchronized by a specific object reference. It is that distinction between shared variables that I am referring to.
I understand that the definition of the memory barrier makes no such distinction between shared variables. My point is that this section of the JLS, like other sections, should be read as a conceptual description ... an "as if" description. Properly written code should not depend on this being explicitly true, for the reasons I described below.
> > It would be incorrect to assume the synchronization > > affects any other external variables. [quoted text clipped - 3 lines] > That's why the compiler cannot remove it, and why it's non-trivial for > the JIT to remove it. Obviously, I disagree with your assertion that it would be incorrect (for JIT) to assume that synchronization does not affect "unreferenced" shared variables. Any user code that depends on this would be improper ... you seem to agree with that assertion. Given that agreement, why would you object to a JIT optimizing that situation?
> > Besides, depending this effect on > > variables *outside* the synchronization would be non-deterministic ... a [quoted text clipped - 6 lines] > execution. Chapter 17 of the JLS (second edition) is entirely devoted > to this topic. Proper multithreading is not nondeterministic where it matters. Optimal multithreading is concerned with constraining local sequences of events. Constraining global sequences is sub-optimal.
> > I can envision a JIT that would determine all external variables accessed > > while synchronized on a specific reference and would only apply the [quoted text clipped - 11 lines] > either way ...), I suggest you read over JLS(2e).17 to see whether you > think it supports your position. Ok, John. I will try to find the time to check that section of the JLS. However, I do believe many areas of the JLS do need to be read as 'conceptual' in nature rather than as absolute rules ... that a JIT following the spirit of the rules is sufficient.
I will also check with an associate who is developing a super, high-performance JVM for his opinion on the matter.
This reminds me of a discussion I had with Dale King (funny that you are both from Indiana) on this forum about a JIT aggressively reducing the reachability of references within a method even though conceptually the reference was reachable until the end of the method. This also conflicted with an explicit interpretation of the JLS, but I felt this type of aggressive optimization was correct, even required. My aforementioned JVM colleague agreed with me.
I do urge you to keep an open mind about this discussion. Others haven't ;^)
Mark Thornton - 11 May 2005 20:43 GMT > Ok, John. I will try to find the time to check that section of the JLS. However, I do > believe many areas of the JLS do need to be read as 'conceptual' in nature rather than > as absolute rules ... that a JIT following the spirit of the rules is sufficient. They have recently gone to a lot of trouble to rewrite all of this (http://www.jcp.org/en/jsr/detail?id=133). The new specification means exactly what it says, neither more nor less. So while relying on the synchronization side effect of a StringBuffer method would be bad style, it is legitimate Java.
Regards Mark Thornton
Lee Fesperman - 13 May 2005 23:48 GMT > > Ok, John. I will try to find the time to check that section of the JLS. > > However, I do believe many areas of the JLS do need to be read as [quoted text clipped - 6 lines] > synchronization side effect of a StringBuffer method would be bad style, > it is legitimate Java. I view it as Very Bad style. It results in obtuse, unmaintainable code.
One should only assume the basics about synchronization. Synchronization on a specific object reference should protect access and update to shared variables (and resources) as long as all updaters to those variables only do it while synchronized on the same object reference. Outside of the synchronized block all bets are off (ignoring volatile for the moment). This also conforms to the general usage of synchronized sections in all software, not just Java.
Anything else is wrong; no matter what the specs say. I haven't seen justification for any other use of synchronization. And, I think this is important to allow optimizers to do as good a job as they can. Cliff Click (I hope you read his comments) agrees with me.
Specifically, Cliff said that when an object is known to be local it is proper for the compiler to optimize away synchronization on that object. I also asked him (privately) whether a compiler could limit fencing to only those variables that it knows are referenced in any and all code synchronized under a specific (shared) object reference. He agreed that it was acceptable but noted that fencing was all-or-nothing on modern hardware. I believe there are some hardware configurations and operating systems where this (all-or-nothing fencing) is not true, thus limited fencing would be appropriate there.
Simply, I want to give the optimizers every technique they can have to do a better job. It's a vital issue to me, similar to my stance in the earlier reachability discussion with Dale King.
John C. Bollinger - 11 May 2005 21:04 GMT > I understand that the definition of the memory barrier makes no such distinction between > shared variables. My point is that section of the JLS, as should other sections, should > be read as a conceptual description ... an "as if" description. Properly written code > should not depend on this being explicitly true, for the reasons I described below. I am not arguing against interpreting the JLS to allow "as if" implementations. I am arguing, however, that the JLS specifies synchronization semantics that affect shared variables that are referenced outside synchronized blocks, and that this is intentional and consistent.
>>Yes, but more to the point, it is also incorrect to assume that the >>synchronization *does not* affect any particular shared variable. [quoted text clipped - 5 lines] > depends on this would be improper ... you seem to agree with that assertion. Given that > agreement, why would you object to a JIT optimizing that situation? If a shared variable has not yet been read by the thread at the memory barrier then it does not need to be considered. If there is no possible execution path by which the thread at the barrier could use the current value of a shared variable before coming to another barrier then that variable does not need to be considered. If that's what you mean by "unreferenced" then maybe we agree after all.
If, however, the thread has previously read the value of a shared variable, and if there is a possible sequence of execution by which the thread can later use the value of that variable -- whether inside the synchronized block or after exiting it -- then the thread must reload the value from main memory before that use (or prove, through program analysis or other means, that the value cannot have changed between the previous read and encountering the barrier). I think the JLS is pretty clear on that, but you may disagree.
As Chris Uppal points out, reloading a shared variable's value might reliably be a no-op in some cases. If the JIT recognizes that then its job is considerably easier and its likelihood of removing the barrier altogether is enhanced. There may be a large number of variables that need to be analyzed, however (I claim), so in other cases the likelihood of JIT removing the barrier is reduced, even if doing so would be allowed.
>>> Besides, depending this effect on >>>variables *outside* the synchronization would be non-deterministic ... a [quoted text clipped - 10 lines] > is concerned with constraining local sequences of events. Constraining global sequences > is sub-optimal. I may be confusing myself by talking on multiple levels at the same time. At the conceptual level of JLS' discussion of synchronization semantics in chapter 17, synchronization is very much about constraining the possible global sequence of all threads locking and unlocking locks and loading and storing values from / to "main memory", and also about constraining the local sequence of each thread's use and assignment of values relative to its own sequence of the aforementioned locking, unlocking, loading, and storing. This constrains what values for shared variables a thread can see, which in turn can affect the sequence of other local operations that each thread will perform.
> Ok, John. I will try to find the time to check that section of the JLS. However, I do > believe many areas of the JLS do need to be read as 'conceptual' in nature rather than > as absolute rules ... that a JIT following the spirit of the rules is sufficient. I think we agree here, although "the spirit of the rules" is a bit too vague for me. I don't have a problem with JIT being able to use its knowledge of the VM implementation and its vantage point over the programs running in it to provide operation "as if" it were following the details of the spec, while actually taking shortcuts. There must be no possibility of producing behavior that is non-compliant with the spec, however.
Our disagreement seems to come down to exactly what the spec does and does not require, and it may be colored, on one side or the other or both, by assumptions about the platform and environment in which the VM runs.
> I will also check with an associate who is developing a super, high-performance JVM for > his opinion on the matter. By all means, provided that the argument is based on the language specs. If your associate is developing a JVM then he must be deeply familiar with the VM specs, which mirror the language specs in this area.
> This reminds me of a discussion I had with Dale King (funny that you are both from > Indiana) on this forum about a JIT aggressively reducing the reachability of references > within a method even though conceptually the reference was reachable until the end of > the method. This also conflicted with an explicit interpretation of the JLS, but I felt > this type of aggressive optimization was correct, even required. My aforementioned JVM > colleague agreed with me. Yes, I remember that. As I recall, I came down on your side on that one, though I don't remember whether I took much part in the discussion.
> I do urge you to keep an open mind about this discussion. Others haven't ;^) I think I am. I am prepared to be wrong, or only partially right, but you're going to need to persuade me. The language specs are the only sound basis for the discussion, however, which is why I asked you to consider them. I fear there is too much chance of us talking past each other otherwise.
Lee Fesperman - 11 May 2005 22:09 GMT > > I will also check with an associate who is developing a super, > > high-performance JVM for his opinion on the matter. > > By all means, provided that the argument is based on the language specs. > If your associate is developing a JVM then he must be deeply familiar > with the VM specs, which mirror the language specs in this area. John, Mark:
I pointed my associate to this subthread, and he asked that I post the following for him:
====================================================================================
[Who am I? I was the Chief Architect of HotSpot's server compiler, worked on it for years, and was one of the outside experts on the Java Memory Model JSR; I implemented most of the 'correctness' part of the new JMM in HotSpot.]
Three issues:
(1) HotSpot 1.4.2 doesn't currently optimize String s = "test"+i; if( blah ) ...use s... Some other JVMs will; try BEA's in particular. Or maybe HotSpot 1.5.0 client. This needs "escape analysis" which will be in HotSpot (client & server) Real Soon Now.
(2) Locking changed in the New Java Memory Model in 1.5. But NOBODY honored the old memory model (which was screwed in many ways). So it's legal in 1.5 to not bother locking/fencing on thread-local objects. It's practical to do it in 1.4.2, because the memory model is so wedged already, who's gonna be able to tell?
(3) When a lock changes threads, this forces synchronization of ALL java values, not just those in the locked object. Otherwise, as Mark Thornton points out, sync'ing on a j.l.Object would be useless. Note that if a lock never changes hands, no synchronization is required.
 Signature Cliff Click
Lee Fesperman - 15 May 2005 00:22 GMT > > I understand that the definition of the memory barrier makes no such > > distinction between shared variables. My point is that section of the [quoted text clipped - 7 lines] > referenced outside synchronized blocks, and that this is intentional and > consistent. That begs my final question. Can you give any justification for writing code that depends on that behavior? In my reply to Mark, I outlined the only proper (IMO) use of synchronized sections. Do you see any other uses?
> > Obviously, I disagree with your assertion that it would be incorrect > > (for JIT) to assume that synchronization does not affect "unreferenced" [quoted text clipped - 17 lines] > previous read and encountering the barrier). I think the JLS is pretty > clear on that, but you may disagree. As above, I'm reaching further than that. I'm limiting things to shared variables accessed in the synchronized section. So, yes I do disagree and published Cliff's comments in support of that.
> As Chris Uppal points out, reloading a shared variable's value might > reliably be a no-op in some cases. If the JIT recognizes that then its > job is considerably easier and its likelihood of removing the barrier > altogether is enhanced. There may be a large number of variables that > need to be analyzed, however (I claim), so in other cases the likelihood > of JIT removing the barrier is reduced, even if doing so would be allowed. As I mentioned, 'super' JVMs exist that are very likely to do that type of analysis. I do not wish to inhibit their optimization techniques.
> >>> Besides, depending this effect on > >>>variables *outside* the synchronization would be non-deterministic ... a [quoted text clipped - 21 lines] > variables a thread can see, which in turn can affect the sequence of > other local operations that each thread will perform. My point is that this may be limited to a subset of all threads by other constraints.
> > I do believe many areas of the JLS do need to be read as 'conceptual' > > in nature rather than as absolute rules ... that a JIT following the [quoted text clipped - 7 lines] > no possibility of producing behavior that is non-compliant with the > spec, however. I wish it were so simple, but slavish compliance with the spec to support what is clearly bad practice (IMO) or completely unnecessary is detrimental to efficient JVMs. In the reachability discussion with Dale King, I mentioned a compiler being able (after proper analysis) to allocate objects on the stack. This is directly contrary to the spec, which states that *all* objects are allocated on the heap. Of course, in that case, who would know?
> Our disagreement seems to come down to exactly what the spec does and > does not require, and it may be colored, on one side or the other or [quoted text clipped - 7 lines] > If your associate is developing a JVM then he must be deeply familiar > with the VM specs, which mirror the language specs in this area. I hope you read Cliff Click's comments and accept his credentials.
> > This reminds me of a discussion I had with Dale King (funny that you > > are both from Indiana) on this forum about a JIT aggressively reducing [quoted text clipped - 6 lines] > Yes, I remember that. As I recall, I came down on your side on that > one, though I don't remember whether I took much part in the discussion. I dunno. I just remember the essential part of discussion coming down to me and Dale, with others dropping out.
John C. Bollinger - 16 May 2005 16:27 GMT >>I am not arguing against interpreting the JLS to allow "as if" >>implementations. I am arguing, however, that the JLS specifies [quoted text clipped - 5 lines] > depends on that behavior? In my reply to Mark, I outlined the only proper (IMO) use of > synchronized sections. Do you see any other uses? No, I don't. I agree that you have captured the *purpose* of synchronization fully and succinctly. But good style has never been the point of this debate, at least to me. Neither have questions of how Java should have been designed in this area, or how other languages work. I have attempted to stick to my point that the synchronization semantics defined by the Java language specification make it difficult to determine the correctness of optimizing away synchronized sections at runtime, and make it flatly incorrect to optimize them away at compile time. This means that the specs place requirements that go beyond the intended purpose of synchronization (as you and I see it), but that isn't any justification for ignoring those parts of the spec that we don't like.
>>>Obviously, I disagree with your assertion that it would be incorrect >>>(for JIT) to assume that synchronization does not affect "unreferenced" [quoted text clipped - 21 lines] > accessed in the synchronized section. So, yes I do disagree and published Cliff's > comments in support of that. My reading of Cliff's comments does not support your position. In particular, he wrote "When a lock changes threads, this forces synchronization of ALL java values." (Emphasis from the original.) He also wrote "It's practical to [not lock/fence on thread-locals] in 1.4.2, because the memory model is so wedged already, who's gonna be able to tell?" The former seems to underscore the JLS interpretation that I have been insisting upon. The latter doesn't support the *correctness* of the potential optimization; rather, I'd paraphrase his comment as "technically it's wrong, but who cares?" Perhaps his comment expresses the same idea presented this way in the introduction to the JSR-133 production spec: "The existing chapters of the JLS [second edition] and JVMS specify semantics that are at odds with optimizations performed by many existing JVMs." I am not prepared to accept an "optimization" that may alter program behavior in an externally visible way that is inconsistent with the requirements of the spec. Such a thing does not match my definition of the term. I am also not prepared to accept the proposed optimization on the basis that "everybody does it" or that VM compliance is routinely bad in this general area. You were concerned about program nondeterminism earlier in the discussion -- surely throwing additional uncertainty into the situation is not helpful in that regard.
As long as we are appealing to experts, I offer this excerpt from section 6.6 of the JSR-133 production spec: "A synchronization action is useless in a number of situations, including lock acquisition on thread-local objects or the reacquisition of a lock on an object on which a thread already has a lock. A number of papers have been published showing compiler analyses that can detect and remove useless synchronization. *The old JMM did not allow useless synchronization to be completely removed*; the new JMM does." (Emphasis mine.) This seems consistent with Cliff's comments, and the emphasized portion is precisely what I have been arguing.
>>As Chris Uppal points out, reloading a shared variable's value might >>reliably be a no-op in some cases. If the JIT recognizes that then its [quoted text clipped - 5 lines] > As I mentioned, 'super' JVMs exist that are very likely to do that type of analysis. I > do not wish to inhibit their optimization techniques. If the VM can prove that the optimization is correct, then it can perform it. I have never claimed otherwise. I have claimed that the proof is very difficult in the general case, and I have also claimed that the proof is impossible for the compiler to perform. The new Java memory model appears to make the required proof easier in some cases (particularly the thread-local case), but as far as I can tell it doesn't make the proof any easier for some other cases.
>>>Proper multithreading is not nondeterministic where it matters. Optimal >>>multithreading is concerned with constraining local sequences of events. >>>Constraining global sequences is sub-optimal.
> My point is that this may be limited to a subset of all threads by other constraints. Agreed, but that doesn't address the general question. In some cases the VM may be more easily able to prove the correctness of the optimization than in others, but that doesn't change the fact that in its full generality, the proof is hard. More to the point, under the Java Memory Model before 1.5, the synchronization object being thread-local does not place such a constraint as you describe.
>>>I do believe many areas of the JLS do need to be read as 'conceptual' >>>in nature rather than as absolute rules ... that a JIT following the [quoted text clipped - 10 lines] > I wish it were so simple, but slavish compliance with the spec to support what is > clearly bad practice (IMO) or completely unnecessary is detrimental to efficient JVMs. It seems we may have a fundamental disagreement here, in which case I'll agree to disagree. I assert that spec compliance -- inasmuch as it is possible to observe and test it -- is essential for predictable program behavior. I don't care how fast a VM is if I cannot rely on it behaving correctly, and the specification is the final arbiter of correctness.
> In the reachability discussion with Dale King, I mentioned a compiler being able (after > proper analysis) to allocate objects on the stack. This is directly contrary to the spec > which states that *all* objects are allocated on the heap. Of course that case, who > would know? A program that required a lot of stack might know, but let's not get off on that discussion. I agree in principle that if it's impossible for the difference to be observed, directly or indirectly, then it's OK for there to be a difference. I maintain, however, that it would be possible for a program to behave differently in an otherwise conforming VM that performed your synchronization optimization than it ever could in a fully conforming VM (at least under the old memory model). It is on that basis that I argue that the VM must prove the correctness of the optimization before performing it, which then precludes the compiler from ever performing it.
> I hope you read Cliff Click's comments and accept his credentials. I did read his comments. I have no argument with his assertions regarding various VM implementations, and I don't find him particularly at odds with my position on the requirements of the spec, but I do not agree with his apparent position on spec compliance. I accept him as an authority on the VM spec, on Java optimization, and on VM implementation, and I don't doubt his expert knowledge of VM-level performance trade-offs inherent in adherence to specific details of the spec. I don't (yet) see any reason to grant special weight to his opinion on the general advantages and disadvantages of compliance, however, at least to the extent that I will back off my assertion that it is unacceptable for an optimization to make program behavior non-compliant in an observable way.
 Signature John Bollinger jobollin@indiana.edu
Lee Fesperman - 16 May 2005 22:15 GMT > > ... Can you give any justification for writing code that depends on > > that behavior? In my reply to Mark, I outlined the only proper (IMO) [quoted text clipped - 12 lines] > isn't any justification for ignoring those parts of the spec that we > don't like. Thanks. I wanted to clarify that point. Since I would agree with anyone who says I have a tendency to (over-) beat a dead horse as it were, I will attempt to distill the technical point of contention. I'll use as justification for continuing Chris Smith's recent indication on another thread that an interesting discussion has its own merit.
I'll drop other aspects of our discussion as 'agree to disagree'.
I perceive that you, Chris Uppal and Mark have a common point as regards the memory barrier (fence) of synchronization. I'll use your previous posting as the basis for responding to that point ...
> >>If, ..., the thread has previously read the value of a shared > >>variable, and if there is a possible sequence of execution by which the [quoted text clipped - 4 lines] > >>previous read and encountering the barrier). I think the JLS is pretty > >>clear on that, but you may disagree. I will simplify this to reduce details (hopefully without loss). Let us assume that the value of a shared variable is read into a temporary location, such as a register, by a given thread before the synchronized block is entered. There may be other mechanisms at work, but I'm simplifying.
First, let me say that I agree that the value must be re-read in the synchronized block (memory barrier), *if accessed in the block*. The issue I'm interested in is whether the value must be refreshed (as part of the memory barrier) if the value is used *after* the synchronized block (but not in the block).
Earlier in the thread, I asserted that at the end of the synchronized block another thread may, in a synchronized block, change the value of the shared variable. This means that the register value of the original thread would be 'stale', maybe not as stale as before but still stale. Since this action is nondeterministic, it would seem that we are arguing degrees of staleness. This leads to my feeling that this concern is 'academic' rather than realistic.
I have tried to think this out thoroughly, so let me posit a counter to my argument. In the synchronized block, the original thread could set a shared flag whose purpose is to tell all others not to change the other shared value. In that case, the original thread could be assured that its register value was not stale. However, I do view this as an extreme case.
If you guys feel that 'degrees' of staleness are important or that my extreme case must be catered to, I guess we will have to agree to disagree and end this discussion.
OTOH, if I've missed some detail in the above discussion, I would love to hear about it.
 Signature Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com) ============================================================== * The Ultimate DBMS is here! * FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
John C. Bollinger - 17 May 2005 00:56 GMT > Thanks. I wanted to clarify that point. Since I would agree with anyone who says I have > a tendency to (over-) beat a dead horse as it were, I will attempt to distill the > technical point of contention. I'll use as justification for continuing Chris Smith's > recent indication on another thread that an interesting discussion has its own merit. Sounds like as good an excuse as any :-)
> I'll drop other aspects of our discussion as 'agree to disagree'. Very well.
> I perceive that you, Chris Uppal and Mark have a common point as regards the memory > barrier (fence) of synchronization. I'll use your previous posting as the basis for [quoted text clipped - 18 lines] > the value must be refreshed (as part of the memory barrier) if the value is used *after* > the synchronized block (but not in the block). Understood.
> Earlier in the thread, I asserted that at the end of the synchronized block another > thread may, in a synchronized block, change the value of the shared variable. This means [quoted text clipped - 11 lines] > If you guys feel that 'degrees' of staleness are important or that my extreme case must > be catered to, I guess we will have to agree to disagree and end this discussion. I can't speak for Mark or Chris, but I insist that the behavior of a Java program must be among those consistent with the language specification, inasmuch as can be observed. How is one to write programs with predictable behavior if one cannot rely on the specs? You can argue that the specs are inconvenient or poorly conceived. Indeed, the JSR 133 working group did precisely that, and came up with what appears to be a significant improvement. If you are really supporting implementation of JVM / program behavior that is observably inconsistent with the spec, however, then we are indeed at an impasse, and this discussion cannot progress.
Assuming that you are not arguing for outright VM non-compliance, you seem to be arguing for different semantics than Java has chosen, even in the new memory model. That's fine, but I don't think I can reason with you about the relative merits of your preferred semantics vs. Java's specified semantics (or about whether there is a difference at all) without considerably more detail on what your preferred semantics actually are. Unless you're prepared to pledge support for one of the existing models, that would amount to you more or less presenting a whole memory model of your own, at least in outline. I'm guessing that (the latter) won't happen.
I do maintain that the old memory model requires that in your example, the first thread is required to refresh its copy of the shared variable in question some time between entering the synchronized block and subsequently using the value, even though the use is after the end of the synchronized block. I also maintain that the new memory model also requires such a refresh if the synchronization object is a shared object that any other thread in the program may synchronize on in the intervening time. In both cases there is room for the VM to optimize away the refresh if it can prove that the variable cannot be written by another thread between the read and the subsequent use.
 Signature John Bollinger jobollin@indiana.edu
Lee Fesperman - 18 May 2005 21:00 GMT > I can't speak for Mark or Chris, but I insist that the behavior of a > Java program must be among those consistent with the language [quoted text clipped - 6 lines] > with the spec, however, then we are indeed at an impasse, and this > discussion cannot progress. It seems that we are at the 'agree to disagree' stage. However, you added some parting shots below, which I will respond to.
> Assuming that you are not arguing for outright VM non-compliance, you > seem to be arguing for different semantics than Java has chosen, even in [quoted text clipped - 6 lines] > whole memory model of your own, at least in outline. I'm guessing that > (the latter) won't happen. No, I'm not arguing for a new memory model. The 'purpose' of the current (1.5) memory model is to ensure proper utilization of shared variables within shared synchronization blocks. Like most technical writing, it concerns itself with the mechanism rather than describing its purpose, its reason for existing. In order to cover the general case, it includes all shared variables (because of 'anonymous' locks). The fact that this mechanism also affects shared variable access outside the synchronized sections is an unavoidable side effect. Depending on this side effect is very dicey coding and is improper/obtuse code. It can only be guaranteed to avoid stale values in extreme cases. Instead of referencing shared variables outside synchronized sections, a better technique would be to store the current value in a local variable inside the synchronized section for use outside the synchronization. Our system uses that exact technique to improve concurrency.
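A minimal sketch of the snapshot technique described above. The names (`SnapshotDemo`, `lock`, `shared`, `doubled`) are hypothetical and do not come from the thread; this only illustrates copying a shared value into a local while the lock is held and then using the local copy afterwards.

```java
public class SnapshotDemo {
    private final Object lock = new Object();
    private int shared; // only read or written while holding lock

    public void set(int value) {
        synchronized (lock) {
            shared = value;
        }
    }

    public int doubled() {
        int snapshot;
        synchronized (lock) {
            // Copy the shared value into a local while the lock is held.
            snapshot = shared;
        }
        // Work on the local copy; no lock is held here, and the code does
        // not rely on any barrier side effect for reads outside the block.
        return snapshot * 2;
    }

    public static void main(String[] args) {
        SnapshotDemo demo = new SnapshotDemo();
        demo.set(21);
        System.out.println(demo.doubled());
    }
}
```

The local copy may of course be out of date by the time it is used; the point of the technique is only that the code never reads the shared field outside a synchronized section.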
I'm not arguing for a new memory model. To achieve the capability I'm referring to would require the memory model spec add a section saying that a compiler could violate covering all shared variables when it knew it was 'safe'. That would just clutter the spec, just like the JLS section that says that *all* objects are allocated on the heap doesn't bother saying, "Oh, if the compiler is smart enough, it can violate this principle." It would have to add this in a lot of places, diluting the spec. The spec is intended to enforce a minimum level of compliance, to ensure the expectations of correct execution by even the most complex Java application are met.
No, I'm not arguing for outright VM non-compliance but for an "as if" reading of the spec. In my view, you're being overly rigid in this aspect (there has been a goodly amount of authoritarianism on this forum recently), since you can't describe a case where expecting special behavior of shared variables outside synchronization sections would be justified or useful.
 Signature Lee Fesperman, FFE Software, Inc. (http://www.firstsql.com) ============================================================== * The Ultimate DBMS is here! * FirstSQL/J Object/Relational DBMS (http://www.firstsql.com)
John C. Bollinger - 10 May 2005 23:10 GMT [...]
>>Synchronizing causes any values 'cached' in thread local memory to be >>written to main memory. It also invalidates any values previously read [quoted text clipped - 4 lines] > here? The memory barrier would only be important for values in the object being > synchronized, which no other thread could see or affect. You misunderstand. After the initial memory barrier (at entry to the synchronized block) the thread must reload from main memory *every* variable whose value it subsequently wants to use. Before the _second_ memory barrier (at exit from the synchronized block) the thread is obliged to write to main memory *all* externally-visible variables that it has modified since their load. The identity of the object synchronized is completely irrelevant: synchronized(new Object()) {} is exactly the same as synchronized(this) {} or synchronized(anythingElse) {} in this regard. This may affect shared variables used by methods further down the stack frame, which may belong to different classes, so it is impossible to determine at compile time that it is safe to remove the barrier.
The VM, with the whole program to work on, has a better chance of being able to determine whether the memory barrier can be removed. It may well still be that the barrier _cannot_ be removed without altering program semantics, however, and in any case a non-trivial analysis is required to make the determination.
> In the case we're discussing, the 1.5 compiler actually does optimize it away ;^) But 1.5 uses a StringBuilder to implement the concatenation instead of a StringBuffer (as I understand it); one of the key advantages of the former is that it does not have an internal memory barrier.
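As a hedged illustration of that difference, reflection can report whether `append(int)` carries the `synchronized` modifier on each class. The class name `AppendLockDemo` and the helper `appendIsSynchronized` are made up for this sketch, and what it reports depends on the JDK it runs on, not on anything stated in the thread.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class AppendLockDemo {
    // Reports whether type.append(int) is declared synchronized.
    static boolean appendIsSynchronized(Class<?> type) {
        try {
            Method append = type.getMethod("append", int.class);
            return Modifier.isSynchronized(append.getModifiers());
        } catch (NoSuchMethodException e) {
            // Both classes used below declare append(int); treat absence as a bug.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("StringBuffer.append(int) synchronized:  "
                + appendIsSynchronized(StringBuffer.class));
        System.out.println("StringBuilder.append(int) synchronized: "
                + appendIsSynchronized(StringBuilder.class));
    }
}
```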
 Signature John Bollinger jobollin@indiana.edu
ilkinulas - 09 May 2005 16:55 GMT we assume that log.isDebugEnabled returns "false" every time.
if the tests are executed under jvm version 1.5, then test1 and test2 give almost the same results. is it possible for a virtual machine to prepare the String s just before it needs it? i mean "test"+i; is calculated not in line 3 but before log.debug(s), if log debug is enabled.
1 public void test1() {
2   for (int i = 0; i < 10000000;i++) {
3     String s ="test"+i;
4     if(log.isDebugEnabled()) {
5
6       log.debug(s);
7     }
8   }
9 }
Tim Tyler - 09 May 2005 16:42 GMT ilkinulas <ilkinulas@gmail.com> wrote or quoted:
> two functions test1 and test2 does the same thing but test2 performs > nearly 20 times better than test1. JVM is unable to optimize the code [quoted text clipped - 23 lines] > } > -------------------------------------------------------------------------- There are a number of class file optimisers available that optimise after compilation.
http://www.geocities.com/marcoschmidt.geo/java-class-file-optimizers.html
...has a list.
Whether any of them will deal with your example, I don't know - and the answer may depend on what log.isDebugEnabled() actually does.
 Signature __________ |im |yler http://timtyler.org/ tim@tt1lock.org Remove lock to reply.
Kevin McMurtrie - 10 May 2005 07:34 GMT > Hi, > two functions test1 and test2 does the same thing but test2 performs [quoted text clipped - 24 lines] > } > -------------------------------------------------------------------------- Do you mean that log.isDebugEnabled() returns false? If so it's totally your fault for test1 being slower.
This: String s ="test"+i;
Compiles to: String s= new StringBuffer("test").append(i).toString();
Lots of code is hidden in such a simple expression. It is beyond the scope of the compiler to determine whether or not there are side effects in all of that. It can not omit its execution simply because the result is not used.
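A small sketch of the expansion described above, assuming a hypothetical class `ConcatDemo`. It only shows that the operator form and the hand-written `StringBuffer` form build the same string; the bytecode a given compiler actually emits may differ (later compilers use `StringBuilder` or other strategies).

```java
public class ConcatDemo {
    // The source-level form used in test1 and test2.
    static String viaOperator(int i) {
        return "test" + i;
    }

    // The hand-written equivalent: an allocation, an append, and a toString().
    static String viaBuffer(int i) {
        return new StringBuffer("test").append(i).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaOperator(7));
        System.out.println(viaBuffer(7));
        System.out.println(viaOperator(7).equals(viaBuffer(7)));
    }
}
```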
ilkinulas - 10 May 2005 10:03 GMT if i have a method for logging like this:
public void debug(String s) { if (log.isDebugEnabled()) { log.debug(s); } }
i would like to use method "debug" in this way debug("test" + someVariable); String s is constructed before checking "if debug is enabled". if debug is not enabled there is no need to concatenate "test" + someVariable.
Thomas Schodt - 10 May 2005 10:21 GMT > if i have a method for logging like this: > [quoted text clipped - 8 lines] > String s is constructed before checking "if debug is enabled". if debug > is not enabled there is no need to concatenate "test" + someVariable. You could do something like
// covers scalar primitives; byte,char,short,int,long public void debug(String s,long l) { if (!log.isDebugEnabled()) return; log.debug(s+l); }
public void debug(String s,boolean b) { if (!log.isDebugEnabled()) return; log.debug(s+b); }
// covers the rest - might even cover primitives in 1.5 ? public void debug(String s,Object o) { if (!log.isDebugEnabled()) return; log.debug(s+o); /* s+o tolerates a null o, unlike o.toString() */ }
You can add as many debug() variants as you care to.
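A self-contained sketch of why these variants help, using stand-ins rather than Log4J: `debugEnabled` replaces `log.isDebugEnabled()`, `System.out` replaces the logger, and `Costly`, `toStringCalls` and `GuardedDebugDemo` are hypothetical names for illustration. It counts how often an expensive `toString()` runs when debugging is off.

```java
public class GuardedDebugDemo {
    // Hypothetical stand-in for log.isDebugEnabled(); not the Log4J API.
    static boolean debugEnabled = false;
    // Counts how often the costly toString() below actually runs.
    static int toStringCalls = 0;

    static class Costly {
        @Override
        public String toString() {
            toStringCalls++;
            return "costly";
        }
    }

    // Eager style: the caller has already built the message.
    static void debug(String message) {
        if (!debugEnabled) return;
        System.out.println(message);
    }

    // Deferred style: concatenation happens only past the guard.
    static void debug(String prefix, Object o) {
        if (!debugEnabled) return;
        System.out.println(prefix + o);
    }

    public static void main(String[] args) {
        debug("value=" + new Costly());  // toString() runs although debug is off
        System.out.println("after eager call: " + toStringCalls);
        debug("value=", new Costly());   // toString() is skipped
        System.out.println("after deferred call: " + toStringCalls);
    }
}
```

The argument object is still created in both calls; only the string building is avoided in the deferred form.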