Java Forum / General / March 2006
System.nanoTime and multiple cpus/cores
transpendence@googlemail.com - 17 Mar 2006 14:53 GMT I've tried to use System.nanoTime to make precise measurements of timing intervals. It works great, but only as long as the program runs on one CPU. If there are multiple CPUs/cores, the running thread seems to switch between different CPUs, and each CPU seems to have a different timer base, so the result of nanoTime jumps forward and backward in time, depending on which CPU the thread is currently running on.
Is it possible to force threads onto a single CPU directly in Java, or to use another high-precision timer (I need ms resolution and it should work on Windows too)?
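[Editor's sketch] The symptom described above can be checked with a small test program. The following is a minimal sketch, not the original poster's code: the class and method names (NanoTimeJumpCheck, countBackwardSteps, sample) are made up for illustration. It takes a run of consecutive System.nanoTime() readings and counts how often a reading is earlier than the one before it.

```java
final class NanoTimeJumpCheck {

    /** Counts how many readings are earlier than their predecessor. */
    static long countBackwardSteps(long[] readings) {
        long backward = 0;
        for (int i = 1; i < readings.length; i++) {
            // Compare by subtraction, as the nanoTime() documentation
            // recommends, so the check survives numerical overflow.
            if (readings[i] - readings[i - 1] < 0) {
                backward++;
            }
        }
        return backward;
    }

    /** Takes n consecutive System.nanoTime() readings on the calling thread. */
    static long[] sample(int n) {
        long[] readings = new long[n];
        for (int i = 0; i < n; i++) {
            readings[i] = System.nanoTime();
        }
        return readings;
    }

    public static void main(String[] args) {
        long[] readings = sample(1000000);
        System.out.println("backward steps: " + countBackwardSteps(readings));
    }
}
```

On a system with a consistent timer the count should stay at zero. A non-zero count on a multi-CPU machine would be consistent with the per-CPU timer base suspected above, though it does not prove that cause.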
Hendrik Maryns - 17 Mar 2006 15:08 GMT
transpendence@googlemail.com wrote:
> I've tried to use System.nanoTime to make precise measurements of timing > intervals. It works great, but only as long as the program runs on > one CPU. If there are multiple CPUs/cores, the running thread seems > to switch between different CPUs, and each CPU seems to have a different > timer base, so the result of nanoTime jumps forward and backward in > time, depending on which CPU the thread is currently running on.
Are you sure it is due to multithreading? There are these small particles that are known to be able to jump back in time (I have to do some reading up on quantum mechanics before posting stuff like this).
:-) H.
 Signature Hendrik Maryns
================== www.lieverleven.be http://aouw.org
lewmania942@yahoo.fr - 17 Mar 2006 16:40 GMT > > If there are multiple cpus/cores, the running thread seem > > to switch between different cpus and each cpu seem to have a different [quoted text clipped - 6 lines] > > :-)
:) But... Don't you think his explanation may be *exactly* what is going on? To me nanoTime() is not very precise when running with several cores/cpus, so the OP's explanation doesn't seem far-fetched at all (but I may be wrong).
Heck, even the famous assembly "rdtsc" instruction (mentioned on Roedy's site, btw) could only be used to measure timing accurately if the pipeline was flushed first, to prevent out-of-order instruction execution. This required hacks and caused serious performance drops (flushing the pipeline could be done by using cpuid).
That said, I can't wait to have Java 9874 which implements System.picoTime(): this time it *really* is accurate... Then two months later Intel starts selling the new virtual-multi-transparent-woozing-buzz-architreadhed-cored-processor and picoTime() isn't really that precise anymore. Repeat ad nauseam.
:) Not that I personally need a sub-nanosecond precision timer or anything ;)
lewmania942@yahoo.fr - 17 Mar 2006 16:06 GMT > I've tried to use System.nanoTime to make precise measurements of timing > intervals. It works great, but only as long as the program runs on > one CPU.
I recall an excellent thread (or was it an article?) explaining why you can't rely on System.nanoTime() to give ultra-precise results when running on multiple CPUs/cores... but I can't find it again :(
You can't entirely reject the possibility of an error in the value given back by System.nanoTime() either: if you search the web you'll find at least one such bug (I think it was one version of Windows that was at fault).
> Is it possible to force threads onto a single CPU directly in > Java
Directly in Java, no, although years ago there was talk about this at Sun. They said that "maybe one day" you'd have the ability to call a method like:
setCPUAffinity()
to define the "affinity mask". This would have solved your problem, but AFAIK it has never been implemented (and may not even be possible to implement at the JVM level).
That said, maybe you can force the affinity mask at the OS level if you *really* need it (I'd rather let the OS scheduler decide how the cores/cpus are used).
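[Editor's sketch] One way to force the affinity mask at the OS level is to launch the JVM through an OS tool that pins the whole process. The sketch below only builds such a command line. The class and method names (AffinityLauncher, pinnedCommand) are hypothetical, and it assumes that `taskset -c` is available on Linux and that the Windows `start` command accepts `/affinity` with a hexadecimal mask. The latter holds on later Windows versions and may not hold on the versions current when this thread was written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AffinityLauncher {

    /**
     * Wraps a command so that the OS pins the new process to one CPU.
     * cpuIndex is zero-based. Nothing is executed here.
     */
    static List<String> pinnedCommand(String osName, int cpuIndex, List<String> command) {
        List<String> full = new ArrayList<String>();
        if (osName.toLowerCase().startsWith("windows")) {
            // Assumption: "start /affinity" takes a hexadecimal CPU bit mask.
            full.addAll(Arrays.asList("cmd", "/c", "start", "/b", "/wait", "/affinity"));
            full.add(Long.toHexString(1L << cpuIndex));
        } else {
            // Assumption: util-linux "taskset -c <cpu-list> <command>".
            full.addAll(Arrays.asList("taskset", "-c", Integer.toString(cpuIndex)));
        }
        full.addAll(command);
        return full;
    }

    public static void main(String[] args) {
        List<String> cmd = pinnedCommand(
                System.getProperty("os.name"), 0, Arrays.asList("java", "-version"));
        System.out.println(String.join(" ", cmd));
        // To actually launch it: new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

Like the Task Manager setting, this limits the whole process to a single CPU, so it trades throughput for a consistent timer.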
transpendence@googlemail.com - 17 Mar 2006 16:43 GMT I can force it to a single CPU via the Windows Task Manager (the problems are gone then), but only after the program has started. And it limits the whole process to a single CPU.
But it seems I've found a solution:
After some searching, I've found that System.nanoTime() uses QueryPerformanceCounter() on Windows and that this function is known to have problems on Athlon64 multicore systems. I've found a hint to use /usepmtimer in boot.ini. I don't know if it really always works, but after I changed it, the problems were gone.
Roedy Green - 17 Mar 2006 19:56 GMT >I've tried to use System.nanoTime to make precise measurements of timing >intervals. It works great, but only as long as the program runs on >one CPU. If there are multiple CPUs/cores, the running thread seems >to switch between different CPUs, and each CPU seems to have a different >timer base, so the result of nanoTime jumps forward and backward in >time, depending on which CPU the thread is currently running on.
That's a bug. Java is supposed to compensate for that.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Patricia Shanahan - 17 Mar 2006 20:26 GMT >>I've tried to use System.nanoTime to make precise measures of timing >>intervalls. I't works great - but only as long as the program runs on [quoted text clipped - 4 lines] > > that's a bug. Java is supposed to compensate for that.
I agree it's a bug, but I'm not sure Java can compensate for it. The JVM does not necessarily know when a thread moves. The operating system does know, and should be providing a consistent timer at the syscall, or equivalent, level.
Patricia
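[Editor's sketch] Short of a fix at the OS level, an application can at least hide the backward jumps from its own code. The following is a minimal sketch of that workaround, not something the JVM is known to do: the class name MonotonicClock is made up, and the approach only guarantees that readings never decrease. It does not make the underlying timer any more accurate.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

final class MonotonicClock {
    private final LongSupplier source;   // e.g. System::nanoTime
    private final AtomicLong lastSeen;   // latest value handed out

    MonotonicClock(LongSupplier source) {
        this.source = source;
        this.lastSeen = new AtomicLong(source.getAsLong());
    }

    /** Returns the raw reading, or the previous result if the raw value went backwards. */
    long nanoTime() {
        long raw = source.getAsLong();
        while (true) {
            long prev = lastSeen.get();
            if (raw - prev <= 0) {
                return prev;             // raw timer jumped back: clamp
            }
            if (lastSeen.compareAndSet(prev, raw)) {
                return raw;              // raw timer advanced: publish it
            }
            // Another thread updated lastSeen in between; retry.
        }
    }
}
```

Usage would be `new MonotonicClock(System::nanoTime)`, shared between threads. Intervals measured across a clamped reading come out too short, which is the price of never seeing a negative one.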
Roedy Green - 17 Mar 2006 23:17 GMT >I agree it's a bug, but I'm not sure Java can compensate for it. The JVM >does not necessarily know when a thread moves. The operating system does >know, and should be providing a consistent timer at the syscall, or >equivalent, level.
Hmm. You would have to enqueue a request to a fixed timer thread. That of course defeats the fine-grained resolution.
Is there at least an integer index of CPU you could grab at the same time as the RDTSC? Intels have a serial number, which can be disabled. AMDs don't.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
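[Editor's sketch] The "fixed timer thread" idea can be sketched with a single-threaded executor: every caller enqueues a request, and only the one timer thread ever calls System.nanoTime(). The class name SingleThreadClock is made up for illustration. Note the assumption: Java cannot pin that thread to a CPU, so this only helps if the thread is also bound to one CPU by some OS-level means. As the post says, the queueing round trip costs far more than the resolution being protected.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SingleThreadClock implements AutoCloseable {
    // The only thread that ever reads the timer.
    private final ExecutorService timerThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "timer-thread");
        t.setDaemon(true);
        return t;
    });

    private final Callable<Long> readTimer = System::nanoTime;

    /** Enqueues a request to the timer thread and waits for its reading. */
    long nanoTime() {
        try {
            return timerThread.submit(readTimer).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for timer", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("timer thread failed", e);
        }
    }

    @Override
    public void close() {
        timerThread.shutdownNow();
    }
}
```

A caller would create one SingleThreadClock for the whole application, call nanoTime() wherever it would otherwise call System.nanoTime(), and close() it on shutdown.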
lewmania942@yahoo.fr - 18 Mar 2006 02:42 GMT > hmm. You would have to enqueue a request to a fixed timer thread. > That of course defeats the fine grain resolution. Indeed.
Do you know how I can find out (out of curiosity) how nanoTime() is implemented in Java 1.5 (and/or 1.6)? (Say under Windows XP and under Linux.)
RDTSC is flawed anyway... As I wrote in another post in this thread, to have a "real" fine-grained RDTSC you have to flush the pipeline before using the instruction, which in itself kind of defeats the purpose.
Moreover, with all the CPUs that throttle their speed (such as many notebook CPUs), RDTSC is basically useless.
And apparently on some hyper-threading systems, functions like Windows' QueryPerformanceCounter sometimes fall back to RDTSC...
This is giving headaches to many game programmers ;)
Chris Uppal - 18 Mar 2006 11:22 GMT > Do you know how I can find (out of curiosity) how nanoTime() is > implemented in Java 1.5 (and/or 1.6) ? (Say under Windows XP > and under Linux).
You could look at the source. I recently did, and posted the following in another thread:
http://groups.google.com/group/comp.lang.java.programmer/tree/browse_frm/thread/1e2fd0fd152c47ad/8d1d4c0006079458?rnum=21#doc_b9ce35b9f7bc0e69
(That's from the 1.5 source).
> Moreover with all the CPU that throttle their speed (such as many > Notebook CPUs) RDTSC is basically useless.
Yup.
> And apparently on some hyper-threading systems, methods like > Window's QueryPerformanceCounter sometimes falls back to > RDTSC...
Do you have a link/reference for that?
-- chris
Mark Thornton - 18 Mar 2006 11:43 GMT >>And apparently on some hyper-threading systems, methods like >>Window's QueryPerformanceCounter sometimes falls back to >>RDTSC... > > Do you have a link/reference for that ?
The implementation of QueryPerformanceCounter seems to be in the part of the kernel that differs between single-processor and multiprocessor implementations. In my experience, on multiprocessors it is always implemented by the RDTSC instruction. On single processors, QPC is implemented via the timer counter.
Mark Thornton
Chris Uppal - 19 Mar 2006 13:35 GMT > The implementation of QueryPerformanceCounter seems to be in the part of > the kernel that differs between single processor or multi processor > implementations. In my experience on multiprocessors it is always > implemented by the RDTSC instruction. On single processors QPC is > implemented via the timer counter.
Interesting, thanks.
I wonder what they'll do with multi-core laptops...
-- chris
lewmania942@yahoo.fr - 18 Mar 2006 16:02 GMT Hi Chris,
...
> > And apparently on some hyper-threading systems, methods like > > Window's QueryPerformanceCounter sometimes falls back to > > RDTSC... > > Do you have a link/reference for that ?
Sadly, I have no handy link... But I recall reading this in more than one place and even seeing a nice little program somehow "proving" it. Googling and browsing endless threads in obscure forums should eventually lead to some interesting info on the subject, but it seems I can't find it that easily :( (I found some other stuff though.)
That said, Patricia was right (as usual) when she said that it's the OS that should be providing a consistent timer.
And I wasn't entirely correct when I said that none of the OSes do this today...
The "new way" of doing it in modern processors is apparently called HPET (working on both newer Intel and AMD systems):
http://www.intel.com/hardwaredesign/hpetspec.htm
Wikipedia is not very lengthy on the subject:
http://en.wikipedia.org/wiki/High_Precision_Event_Timer
It is not implemented yet in any Windows version (apparently Dell even disables it in some BIOSes that would otherwise provide the functionality, on the basis that no desktop Windows uses it yet).
But... it is already working on some other systems. For example, some Linux kernels (if I read correctly) now have a gettimeofday() that uses the underlying "this-time-really-high-precision-and-consistent-amongst-threads-and-cpus-we-promise-you" HPET timer (now that's redundant, as the 'T' is for "timer" ;)
So it may be possible that some people using System.nanoTime() are already benefiting from this new high precision event timer after all.
Regarding my previous question "how to know how System.nanoTime() is implemented?", I somehow expected that the answer was "RTFS" (Read The Fine Source), but, concretely, how do I do this?
Do I have access to all the native code too ? (and if I want to see how some JNI method is done on Windows but I've got a Linux JDK, does it mean I've got to download a Windows JDK ?)
Thanks and talk to you all very soon
Chris Uppal - 19 Mar 2006 13:50 GMT > The "new way" of doing in in modern processors is apparently > called HPEC (working both on newer Intel and AMDs) : > > http://www.intel.com/hardwaredesign/hpetspec.htm
Thanks.
> Regarding my previous question "how to know how > System.nanoTime() is implemented?", I somehow [quoted text clipped - 5 lines] > Windows but I've got a Linux JDK, does it mean I've > got to download a Windows JDK ?)
You can download the entire platform source from the normal download page:
http://java.sun.com/j2se/1.5.0/download.jsp
Even more than normal, check the license /very/ carefully before accepting it. It is /not/ the same licence as the JDK or JRE. In fact, there are two licences you can opt for, one of which is entirely abominable, the other of which might be acceptable.
That contains (afaik) the entire source for Windows, Linux, and Solaris builds, including the C++ source for the JVM and the native methods. Pretty big[*] and, though it's not badly structured, it may take you a while to learn your way around it. It helps if you are reasonably familiar with JNI.
([*] around 200 meg, nearly 20K files.)
You'll find (if you do accept the licence) the C++ method os::elapsed_counter(), which is where the nano timer gets its data, defined in several files (according to OS). The Windows implementation is in: <root>/hotspot/src/os/win32/vm/os_win32.cpp
-- chris
lewmania942@yahoo.fr - 18 Mar 2006 16:13 GMT Hi Roedy,
...
> Is there at least an integer index of CPU you could grab at the same > time as the RDTSC? Intels have a serial number, which can be > disabled. AMDs don't.
AMD recently introduced RDTSCP: Read Serialized TSC Pair (no idea how it works).
Here's a link to how to "hack around" the tricky TSC (when no HPET is available on the underlying hardware) :
http://lkml.org/lkml/2005/11/4/173
lewmania942@yahoo.fr - 18 Mar 2006 03:31 GMT Hi Patricia,
...
> I agree its a bug, but I'm not sure Java can compensate for it. The JVM > does not necessarily know when a thread moves. The operating system does > know, and should be providing a consistent timer at the syscall, or > equivalent, level.
But apparently it doesn't provide it.
:( I dug up a thread from 2003 on an Intel forum... (I'm pretty sure the hundreds of megabytes of patches Windows has had since that time didn't fix that problem, and the situation on Linux OSes doesn't seem any better ;)
http://softwareforums.intel.com/ids/board/message?board.id=42&message.id=155
"You could set up the OS to support high precision virtual timers or virtual TSC's (it's fairly trivial) but it's not currently there in any OS"
Now I'm all ears: if someone can show me how to cleanly have a Java high precision timer on a multi-cored-multi-cpu-hyper-threaded-(insert latest CPU feature)-system providing nanosecond (or sub-nanosecond) accuracy without side effects (for example without flushing any pipeline), I'll read very carefully. It has to work on Intel, AMD, and all the others and also, of course, on various OSes.
Until then, I'll code my Java apps without relying on System.nanoTime() giving very meaningful values (i.e. without hoping it'll really provide a high-precision timer on the various architectures the JVM runs on).
:)