Java Forum / First Aid / December 2005
Xeon processor and Java - better multithreading?
Knute Johnson - 06 Dec 2005 19:57 GMT
We are trying to make a hardware decision for a large project I'm working on. The computers are rack mount and we need pretty fast processors. The supplier wants to sell us a 2.4Ghz Xeon processor box. He says that will give us more performance than a 3.4Ghz P4. I don't really know much about the new processors. Can Sun's Java use the dual core features of the Xeon to good enough advantage to provide a performance increase over the P4? Anybody have any actual experience with multi-threaded programs and the Xeon processors?
Thanks,
 Signature Knute Johnson email s/nospam/knute/
Thomas Hawtin - 06 Dec 2005 20:36 GMT
> We are trying to make a hardware decision for a large project I'm
> working on. The computers are rack mount and we need pretty fast
[quoted text clipped - 4 lines]
> performance increase over the P4? Anybody have any actual experience
> with multi-threaded programs and the Xeon processors?
For servers multi-processor/multi-threaded is quite normal. Sun have just launched a series of 1U and 2U machines that have processors of up to 8 cores, each with 4 hardware threads.
Spec have some results:
http://www.spec.org/jbb2005/results/jbb2005.html
There are some new results from Sun that show how well multi-core processors scale (Sun's new machines are using four 64-bit memory channels).
http://blogs.sun.com/roller/page/dagastine?entry=ultrasparc_t1_screams_running_java
(Note, the Opteron in the bar chart is currently mislabeled 2 core, instead of 4.)
You need to be careful that your software does not have problems with hardware threads fighting for the same lock.
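As a rough illustration of the lock problem (a minimal sketch, not a measurement of any real workload), the code below counts in two ways: every thread incrementing one counter under a single shared lock, and every thread counting privately and publishing its total once. The class and method names (`ContentionSketch`, `countWithSharedLock`, `countWithLocalTotals`) are made up for this example, and the JIT may optimize the private loop heavily, so treat the timings as indicative only.

```java
import java.util.ArrayList;
import java.util.List;

final class ContentionSketch {
    private static final Object LOCK = new Object();
    private static long shared; // guarded by LOCK

    // Every increment takes the same lock, so the threads queue up behind it.
    static long countWithSharedLock(int threads, int perThread) {
        synchronized (LOCK) { shared = 0; }
        List<Thread> pool = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            pool.add(new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    synchronized (LOCK) { shared++; }
                }
            }));
        }
        runAll(pool);
        synchronized (LOCK) { return shared; }
    }

    // Each thread counts in a local variable and publishes one result at the end.
    static long countWithLocalTotals(int threads, int perThread) {
        long[] partial = new long[threads];
        List<Thread> pool = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            pool.add(new Thread(() -> {
                long local = 0;
                for (int i = 0; i < perThread; i++) {
                    local++;
                }
                partial[slot] = local; // visible to the caller after join()
            }));
        }
        runAll(pool);
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    // Starts all threads, then waits for all of them to finish.
    private static void runAll(List<Thread> pool) {
        for (Thread th : pool) {
            th.start();
        }
        try {
            for (Thread th : pool) {
                th.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int perThread = 5_000_000; // arbitrary workload size
        long t0 = System.nanoTime();
        long a = countWithSharedLock(threads, perThread);
        long t1 = System.nanoTime();
        long b = countWithLocalTotals(threads, perThread);
        long t2 = System.nanoTime();
        System.out.printf("shared lock:  %d in %d ms%n", a, (t1 - t0) / 1_000_000);
        System.out.printf("local totals: %d in %d ms%n", b, (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same count; the difference worth watching on a multi-core or multi-threaded box is how the first one's runtime behaves as the thread count goes up.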
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/
Roedy Green - 06 Dec 2005 23:24 GMT On Tue, 06 Dec 2005 11:57:58 -0800, Knute Johnson <nospam@ljr-2.frazmtn.com> wrote, quoted or indirectly quoted someone who said :
>2.4Ghz Xeon processor box
A Xeon is basically a 32-bit Pentium with 133 extra SIMD instructions for parallel processing. As far as I know, Sun's JVM/HotSpot almost totally ignores the parallel processing part of the CPU, so unless you exploit those instructions through JNI, or your womb or OS uses them, they are pretty much a waste. Depending on the model, you might have more than one core (CPU hardware) and hyperthreading (hardware threads that simulate separate CPUs).
There are also now 64-bit Xeons. On a quick look at the Intel site, it looks as if these things use the Itanium 64-bit architecture. This is not as widely supported as the AMD Opteron. Last time I looked, Sun supports AMD Opteron but not Intel Itanium in 64-bit mode. IBM has a beta Itanium JVM for W2K. I don't know what your options are for OS and supporting software.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Thomas Hawtin - 06 Dec 2005 23:51 GMT
> There are also now 64-bit Xeons. On a quick look at the Intel site,
> it looks as if these things use the Itanium 64-bit architecture.
> This is not as widely supported as the AMD Opteron. Last time I
> looked, Sun supports AMD Opteron but not Intel Itanium in 64-bit
> mode. IBM has a beta Itanium JVM for W2K. I don't know what your
> options are for OS and supporting software.
No, 64-bit Xeons use EM64T, which is Intel for AMD64. Some early chips with EM64T missed a few instructions off. Got to love Intel. Intel now produce AMD clones.
IIRC, in addition to providing 64-bit addressing, AMD64 has a slightly less brain-dead instruction set using a larger number of general purpose registers. It is that which allows 64-bit mode to run faster than 32-bit. Apparently on SPARC you are looking at a performance drop of 10-15% going to 64-bit, because pointers take up twice as much memory and hence require greater bandwidth.
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/
Roedy Green - 07 Dec 2005 02:29 GMT On Tue, 06 Dec 2005 23:54:01 +0000, Thomas Hawtin <usenet@tackline.plus.com> wrote, quoted or indirectly quoted someone who said :
>No, 64-bit Xeons use EM64T, which is Intel for AMD64. Some early chips
>with EM64T missed a few instructions off. Got to love Intel. Intel now
>produce AMD clones.
I have added a number of entries to the Computer Buyers' glossary for the various CPUs. If you had time you might look at them and tell me where I screwed up or suggest extra notes.
See http://mindprod.com/bgloss/cpu.html and follow the links.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Roedy Green - 06 Dec 2005 23:25 GMT On Tue, 06 Dec 2005 11:57:58 -0800, Knute Johnson <nospam@ljr-2.frazmtn.com> wrote, quoted or indirectly quoted someone who said :
> Anybody have any actual experience
>with multi-threaded programs and the Xeon processors?
Perhaps you might find out what the huge server farms are using that are running software similar to yours.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
Knute Johnson - 07 Dec 2005 16:48 GMT
> On Tue, 06 Dec 2005 11:57:58 -0800, Knute Johnson
> <nospam@ljr-2.frazmtn.com> wrote, quoted or indirectly quoted someone
[quoted text clipped - 5 lines]
> Perhaps you might find out what the huge server farms are using that
> are running software similar to yours.
Thanks very much for the info guys.
 Signature Knute Johnson email s/nospam/knute/
blmblm@myrealbox.com - 07 Dec 2005 17:22 GMT
>> On Tue, 06 Dec 2005 11:57:58 -0800, Knute Johnson
>> <nospam@ljr-2.frazmtn.com> wrote, quoted or indirectly quoted someone
[quoted text clipped - 7 lines]
>
>Thanks very much for the info guys.
A little more ....
Where I work, we recently purchased some new multi-processor (possibly it's more accurate to call them multi-core) machines, and I think the processors are Xeons. What I've observed:
(*) Programs with multiple threads (in Java and in C with OpenMP) can get nearly perfect speedups relative to the number of processors/cores (e.g., runtime on the computational kernel decreases by a factor of 2 with 2 processors, 4 with 4 processors) if the code being run is pure computation, with limited memory access and little need for synchronization. So if the code is such that it can make good use of multiple processors, the hardware/software platform seems to allow that to happen.
(*) Hyperthreading can help, but (as I understand it from talking to someone at Intel) only by masking latency. If your code can benefit from fast context switches, hyperthreading will help you. If it's pure computation (no waiting for I/O, memory, etc.), hyperthreading probably will not help. I did some short experiments and observed that, on a machine with two hyperthreaded processors, four threads are no faster than two, for a multithreaded program that's almost pure computation.
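For anyone who wants to repeat this kind of experiment, here is a minimal sketch of a timing harness, under the assumption that the workload is pure computation with no shared state. `SpeedupSketch`, `kernel` and `parallelSum` are hypothetical names, and the arithmetic in `kernel` is just a stand-in for a real computational kernel. Note that `Runtime.availableProcessors()` reports logical processors, so on a hyperthreaded machine it counts hardware threads rather than physical cores.

```java
final class SpeedupSketch {
    // Pure computation over [lo, hi): no I/O, no locks, almost no memory traffic.
    static long kernel(long lo, long hi) {
        long sum = 0;
        for (long i = lo; i < hi; i++) {
            sum += (i * i) % 7;
        }
        return sum;
    }

    // Splits [0, n) into one contiguous chunk per thread and adds up the partial sums.
    static long parallelSum(long n, int threads) {
        long chunk = (n + threads - 1) / threads; // ceiling division
        long[] partial = new long[threads];
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            final long lo = Math.min(n, slot * chunk);
            final long hi = Math.min(n, lo + chunk);
            pool[t] = new Thread(() -> partial[slot] = kernel(lo, hi));
            pool[t].start();
        }
        try {
            for (Thread th : pool) {
                th.join(); // join() makes each partial result visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        long n = 400_000_000L; // arbitrary problem size
        System.out.println("logical processors: "
                + Runtime.getRuntime().availableProcessors());
        for (int threads : new int[] {1, 2, 4}) {
            long start = System.nanoTime();
            long result = parallelSum(n, threads);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + ms + " ms, result " + result);
        }
    }
}
```

The first timed run also pays for JIT warm-up, so it is worth running the loop more than once before reading anything into the numbers.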
Take all of this with a big grain of salt, because I am far from an expert, but -- two cents' worth from one user, maybe.
| B. L. Massingill | ObDisclaimer: I don't speak for my employers; they return the favor.
Roedy Green - 07 Dec 2005 19:52 GMT
>Hyperthreading can help, but (as I understand it from talking to
>someone at Intel) only by masking latency. If your code can benefit
[quoted text clipped - 4 lines]
>are no faster than two, for a multithreaded program that's almost
>pure computation.
Hyperthreading in theory might help when:
1. one thread is doing floating point and the other integer. You could get some parallelism going in the common core pipeline.
2. you have a dumb i/o device that interrupts per character. The other thread can carry on uninterrupted.
3. Let's say you had a number of i/o intensive tasks. For maximum efficiency, the one you want to run is the one that will do an i/o and block soonest. You want to do the CPU intensive stuff once you have all your i/o channels busy. I would hazard a guess that with two threads running at half speed you on average hit an i/o sooner. It is sort of a way of hedging your bets on which thread will do an i/o first.
I am just guessing here. It should be possible to do a simple simulation of a CPU/i/o system to see what sorts of conditions favour hyperthreading. Surely the chip makers did some white papers on what hyperthreading buys you. Surely it is not just a marketing ploy.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
blmblm@myrealbox.com - 08 Dec 2005 10:45 GMT
>>Hyperthreading can help, but (as I understand it from talking to
>>someone at Intel) only by masking latency. If your code can benefit
[quoted text clipped - 12 lines]
>2. you have a dumb i/o device that interrupts per character. The other
>thread can carry on uninterrupted.
Seems plausible. It would be interesting for someone to do the experiments anyway.
>3. Let's say you had a number of i/o intensive tasks. For maximum
>efficiency, the one you want to run is the one that will do an i/o and
[quoted text clipped - 3 lines]
>It is sort of a way of hedging your bets on which thread will do an
>i/o first.
I'm a little skeptical of this one, but maybe.
>I am just guessing here. It should be possible to do a simple simulation
>of a CPU/i/o system to see what sorts of conditions favour
>hyperthreading. Surely the chip makers did some white papers on what
>hyperthreading buys you. Surely it is not just a marketing ploy.
I don't know whether they did simulations ahead of time -- though one would think they would -- but I *think* (but this is based on unspecific conversations with a colleague) that Intel did experiments once they'd built the hardware and found that performance improvement ranged from zero to 30%. So, not a marketing ploy, but also not a silver bullet for all applications?
| B. L. Massingill | ObDisclaimer: I don't speak for my employers; they return the favor.
Oliver Wong - 08 Dec 2005 18:51 GMT
>>3. Let's say you had a number of i/o intensive tasks. For maximum
>>efficiency, the one you want to run is the one that will do an i/o and
[quoted text clipped - 5 lines]
>
> I'm a little skeptical of this one, but maybe.
This sounds plausible to me. The other extreme is doing all your CPU intensive tasks first. Fine, they're finished. Then you do the I/O task, and it blocks. Now the CPU is idle. Seems more wasteful than trying to do the I/O task first, and then, when it blocks, you keep the CPU busy with the CPU intensive tasks.
- Oliver
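The ordering argument can be put into numbers with a toy model. This is only a sketch under simplifying assumptions: one CPU, one I/O task that needs a short CPU burst to issue its request and then waits on the device without using the CPU, and a set of CPU-bound tasks. The class name `IoOrderSketch`, the method names and the durations are all invented for illustration.

```java
final class IoOrderSketch {
    // CPU-bound tasks run first; the I/O task starts afterwards and the
    // CPU sits idle for the whole device wait.
    static long cpuFirst(long[] cpuTasksMs, long ioBurstMs, long ioWaitMs) {
        return totalCpu(cpuTasksMs) + ioBurstMs + ioWaitMs;
    }

    // The I/O task issues its request first; the CPU-bound work then
    // overlaps with the device wait.
    static long ioFirst(long[] cpuTasksMs, long ioBurstMs, long ioWaitMs) {
        return ioBurstMs + Math.max(ioWaitMs, totalCpu(cpuTasksMs));
    }

    private static long totalCpu(long[] cpuTasksMs) {
        long total = 0;
        for (long ms : cpuTasksMs) {
            total += ms;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] cpuTasksMs = {30, 50}; // hypothetical CPU-bound tasks
        long ioBurstMs = 5;           // CPU time needed to issue the I/O
        long ioWaitMs = 100;          // device time, CPU not needed
        System.out.println("CPU first: " + cpuFirst(cpuTasksMs, ioBurstMs, ioWaitMs) + " ms"); // 185 ms
        System.out.println("I/O first: " + ioFirst(cpuTasksMs, ioBurstMs, ioWaitMs) + " ms");  // 105 ms
    }
}
```

With these made-up numbers, starting the I/O first finishes everything in 105 ms instead of 185 ms, because the 80 ms of computation hides inside the 100 ms wait.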
Thomas Hawtin - 08 Dec 2005 20:41 GMT
> I am just guessing here. It should be possible to do a simple simulation
> of a CPU/i/o system to see what sorts of conditions favour
> hyperthreading. Surely the chip makers did some white papers on what
> hyperthreading buys you.
The big advantage is not in sharing functional units. Indeed, non-Intel takes on the concept tend, in any particular cycle, to process only instructions from the same thread.
The primary problem that multi-threaded hardware is attempting to overcome is the delay between requesting data from DRAM and it actually arriving. In terms of cycles, this delay is becoming longer and longer. Simultaneous multi-threading allows the core to remain busy even when one thread is waiting for the memory to catch up.
To a lesser extent, having multiple hardware threads reduces the need for thread context switches. I guess that will be more important on very low power devices.
Sun Niagara (UltraSPARC T1) uses four hardware threads per core. Rumours are that Niagara II will have eight hardware threads per core.
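A back-of-the-envelope model of the latency-hiding argument above, offered as a sketch only: assume each hardware thread alternates between a burst of computation and a stall waiting on memory, and that the core can switch to another ready thread at no cost. Core utilization is then roughly min(1, n * C / (C + M)) for n hardware threads, C compute cycles and M stall cycles per burst. The cycle counts below are invented, and `LatencyHidingSketch` and `utilization` are hypothetical names.

```java
final class LatencyHidingSketch {
    // Fraction of cycles the core spends doing useful work, assuming
    // computeCycles > 0 and free switching between hardware threads.
    static double utilization(int hwThreads, double computeCycles, double stallCycles) {
        double perThread = computeCycles / (computeCycles + stallCycles);
        return Math.min(1.0, hwThreads * perThread);
    }

    public static void main(String[] args) {
        double computeCycles = 50;  // hypothetical work between cache misses
        double stallCycles = 150;   // hypothetical DRAM latency in cycles
        for (int n : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d hardware thread(s): %.2f%n",
                    n, utilization(n, computeCycles, stallCycles));
        }
    }
}
```

Under these assumed numbers one thread keeps the core busy a quarter of the time and four threads saturate it, after which extra threads buy nothing. Setting the stall to zero gives full utilization with a single thread, which matches the observation earlier in the thread that pure computation sees no benefit from hyperthreading.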
> Surely it is not just a marketing ploy.
Sharing functional units between threads is more sexy than advertising how slow your memory is.
Tom Hawtin
 Signature Unemployed English Java programmer http://jroller.com/page/tackline/