Java Forum / General / May 2006
Does a Java program run as fast on a 64-bit platform as on a 32-bit platform?
jcc - 10 May 2006 11:44 GMT
Since the size of an integer is fixed (32 bits) on any platform, does a Java program run as fast on a 64-bit platform as on a 32-bit platform?
Roedy Green - 10 May 2006 18:28 GMT
> Since the size of integer is fixed (32 bits) on any platform, does java
> program run on 64-bit platform as fast as on 32-bit platform?
Java runs on a wide range of 32 and 64 bit platforms. If you have enough money you can get yourself a mainframe or a machine with 256 processors.
But I think what you are getting at is the observation that given the same amount of silicon, a 64 bit CPU will typically be slower since more RAM and bandwidth are used to get the same job done. The reason for 64 bits is to address huge virtual memories. You can plausibly do all your IO with memory mapping or virtual RAM allowing you to boost performance simply by buying more real RAM without redesigning your software.
 Signature Canadian Mind Products, Roedy Green. http://mindprod.com Java custom programming, consulting and coaching.
jmcgill - 10 May 2006 19:10 GMT
> Since the size of integer is fixed (32 bits) on any platform, does java
> program run on 64-bit platform as fast as on 32-bit platform?
The Java program is the same, so the question really addresses the implementation of the JVM that runs it. And the question of which is 'faster' depends on a great many things.
Between a given 64-bit datapath and a given 32-bit datapath, each may have pros and cons versus the other.
A 64-bit control might be able to provide some very efficient instructions, and a program (a JVM in this case) might be coded to take advantage of that. Or, a 64-bit control might be implementing the same instructions as the 32-bit processor.
A 64-bit processor should have an order of magnitude more registers compared to a 32-bit processor. If the program can exploit this, then some operations that would require loading from memory (typically the slowest instruction) can work in register space instead.
A 64-bit processor is likely to have been designed from scratch relatively recently, and thus, may have a very sophisticated forwarding/pipelining architecture which could make it more efficient than another chip executing the same series of instructions.
A 64-bit processor has a much larger addressing space, and much wider buses. This means that things like sequential reads from memory can be optimized by grabbing big chunks of RAM per load instruction, in anticipation of spatial locality. It also means that cache associations can be made much wider, and caches can be much taller than the equivalent 32-bit variety, and this should also result in a more efficient datapath.
There are some diminishing returns, even some liabilities, when so many transistors are put on a chip, as is implied by the requirement of a 64-bit datapath. So we are likely looking at a multicore processor instead of one giant single path. Then it's a significant factor whether the program (the JVM and the OS) can exploit the performance benefits implied by the processor architecture.
Now your question was only about integers. That's a tough one. Many of the considerations around the datapath also apply in various ways to the ALU (arithmetic logic unit(s)). On the other hand, a 64-bit ALU might not necessarily outperform a 32-bit one with the same operations on 32-bit register data.
Overall, I would certainly expect a 64-bit processor to do no worse than a 32-bit processor running the same program, but I would also expect certain benefits of the 64-bit datapath to give increased performance as a side effect of the wider paths, more registers, cache design, etc. Further, if the program was designed for the 64-bit architecture, then it should be expected to take advantage of any performance benefits that are offered by that architecture.
Regardless, the *program* isn't the java program, it's the JVM.
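A quick way to see both halves of this point from inside a running program is a minimal sketch like the one below. The class name `WordSizeCheck` is made up for illustration. `Integer.SIZE`, `Long.SIZE` and the `os.arch` property are standard; `sun.arch.data.model` is a HotSpot-specific property and is assumed here, so it may well print `null` on other JVMs.

```java
class WordSizeCheck {
    /** Width of int in bits; fixed by the language, not by the host CPU. */
    static int intBits() { return Integer.SIZE; }

    /** Width of long in bits; likewise fixed by the language. */
    static int longBits() { return Long.SIZE; }

    /** int arithmetic wraps at 32 bits on every JVM, 32-bit or 64-bit. */
    static int wrapped() { return Integer.MAX_VALUE + 1; }

    public static void main(String[] args) {
        System.out.println("int bits:  " + intBits());
        System.out.println("long bits: " + longBits());
        System.out.println("Integer.MAX_VALUE + 1 = " + wrapped());
        // What differs between machines is the JVM underneath, not the program.
        System.out.println("os.arch = " + System.getProperty("os.arch"));
        // Assumption: HotSpot-specific property, may be null elsewhere.
        System.out.println("sun.arch.data.model = "
                + System.getProperty("sun.arch.data.model"));
    }
}
```

The first three lines of output are the same everywhere; only the last two depend on which JVM and hardware happen to be running the class file.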
alexandre_paterson@yahoo.fr - 10 May 2006 20:06 GMT
> > Since the size of integer is fixed (32 bits) on any platform, does java
> > program run on 64-bit platform as fast as on 32-bit platform?
...
> A 64-bit processor should have an order of magnitude more registers
> compared to a 32-bit processor.
"an order of magnitude"
^^^^^^^^^^^^^^^^^^^^^^
I guess you're counting, like the processors, in base 2 :)
Seriously though, a 32 bit x86 can be considered to have 8 GPRs (general-purpose registers) and 8 FPRs (floating-point registers)... You won't find many 64 bit CPUs having 80 GPRs and 80 FPRs these days (though there have been some having as many as 64 GPRs IIRC, but this isn't very common on the most successful 64 bit platforms Java runs on today, if I'm not mistaken).
Just nitpicking; that said, I agree with everything you said!
jmcgill - 10 May 2006 20:15 GMT
>>> Since the size of integer is fixed (32 bits) on any platform, does java
>>> program run on 64-bit platform as fast as on 32-bit platform?
[quoted text clipped - 6 lines]
> > I guess you're counting, like the processors, in base 2 :)
I should have said "it *could* have".
I'm thinking a 64-bit MIPS instruction could simply double the register index from 5 bits to 10. Instead of 32 registers you could have 1024 without really changing the basic architecture. Obviously with the wide instruction we could be a lot smarter than that.
I admit I'm only guessing, and my only experience with any real 64-bit processors has been running in 32-bit mode on 64-bit Sun hardware.
I would actually like to see benchmarks based on whatever the currently popular 64-bit machines happen to be, and I would be very interested in reading opinions from people who have implemented JRE (and OS) for 64 bit.
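For anyone who wants to try this on their own 32-bit and 64-bit JVMs, a deliberately crude timing sketch follows. The class and method names are hypothetical, the loop body is arbitrary, and a hand-rolled loop like this is easily distorted by JIT warm-up and other effects, so treat any numbers it prints as a rough hint rather than a benchmark result.

```java
class IntVsLongLoop {
    /** Sums i * 31 over 0..n-1 using 32-bit arithmetic (wraps on overflow). */
    static int sumInts(int n) {
        int acc = 0;
        for (int i = 0; i < n; i++) {
            acc += i * 31;
        }
        return acc;
    }

    /** Same loop using 64-bit arithmetic. */
    static long sumLongs(int n) {
        long acc = 0;
        for (long i = 0; i < n; i++) {
            acc += i * 31L;
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 100000000;
        // Warm-up passes so the JIT has a chance to compile both loops.
        for (int r = 0; r < 5; r++) {
            sumInts(n);
            sumLongs(n);
        }
        long t0 = System.nanoTime();
        int a = sumInts(n);
        long t1 = System.nanoTime();
        long b = sumLongs(n);
        long t2 = System.nanoTime();
        // Printing the sums keeps the loops from being discarded as dead code.
        System.out.println("int loop:  " + (t1 - t0) / 1000000 + " ms (sum " + a + ")");
        System.out.println("long loop: " + (t2 - t1) / 1000000 + " ms (sum " + b + ")");
    }
}
```

Running the same class file under a 32-bit and a 64-bit JVM on the same machine gives a first impression of how each handles `int` versus `long` arithmetic; it says nothing about object-heavy code.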
Oliver Wong - 10 May 2006 21:57 GMT
>> > Since the size of integer is fixed (32 bits) on any platform, does java
>> > program run on 64-bit platform as fast as on 32-bit platform?
[quoted text clipped - 16 lines]
> > Just nitpicking that said, I agree with everything you said!
It's been a long time since I've done development in assembly, but I thought it was only Intel's x86 design which was register-starved, and most other architectures had a lot more. I vaguely remember one processor I worked with (Motorola PowerPC?) which had something like 120 physical registers and a rotating window circular-buffer thing to emulate an infinite number of registers, given sufficient amount of RAM.
- Oliver
jmcgill - 10 May 2006 22:01 GMT
> It's been a long time since I've done development in assembly,
It's been less than a week for me :-)
> rotating window circular-buffer thing to
> emulate an infinite number of registers, given sufficient amount of RAM.
If you have to load a value from RAM, you are not only defeating the purpose of a register access, but you are also replacing register access, the fastest operation in the datapath, with memory access, the slowest.
Oliver Wong - 10 May 2006 22:09 GMT
> > It's been a long time since I've done development in assembly,
> [quoted text clipped - 6 lines]
> purpose of a register access, but you are also replacing register access,
> the fastest operation in the datapath, with memory access, the slowest.
Let's say the registers are numbered 0 to 119. You start up with the window pointing at registers 0 to 19. You do some fiddling there, and decide you want a new set of registers (e.g. because you're about to jump to a subprocedure), so the window rolls over to point to 20 to 39. When you start getting dangerously near to 119, a background process kicks in and saves the values from 0 to 19 to RAM, so by the time you tell the CPU you want the window to roll over and give you the logical registers 120 to 139, you'll actually be using the physical registers 0 to 19, and no data loss will occur. Also, since the background process intelligently waits until the memory channels are idle before doing its background save, no delay is noticed.
Something like that, anyway. It's been a couple of years since I've worked on this architecture.
- Oliver
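The rolling window described above can be modelled in a few lines of Java. This is only a sketch under stated assumptions: the class name `RegisterWindowSketch` and its methods are invented for illustration, the 120-register file and 20-register window are taken from the post, and the spill to "RAM" happens synchronously at the moment of the roll-over rather than in an idle-time background process.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class RegisterWindowSketch {
    static final int PHYSICAL = 120;            // physical registers, as in the post
    static final int WINDOW = 20;               // registers visible at once
    static final int SLOTS = PHYSICAL / WINDOW; // windows that fit in the register file

    private final int[] regs = new int[PHYSICAL];
    private final Deque<int[]> ram = new ArrayDeque<>(); // stand-in for spill memory
    private int current = 0;        // logical window number in use
    private int oldestResident = 0; // lowest logical window still held in registers

    private static int base(int window) {
        return (window % SLOTS) * WINDOW;
    }

    int get(int r) { return regs[base(current) + r]; }

    void set(int r, int value) { regs[base(current) + r] = value; }

    int spilledWindows() { return ram.size(); }

    /** Roll forward to a fresh window, saving the oldest one if it would be overwritten. */
    void push() {
        int next = current + 1;
        if (next - oldestResident >= SLOTS) {
            int b = base(oldestResident);
            ram.push(Arrays.copyOfRange(regs, b, b + WINDOW));
            oldestResident++;
        }
        current = next;
        Arrays.fill(regs, base(current), base(current) + WINDOW, 0);
    }

    /** Roll back to the caller's window, reloading it from RAM if it was spilled. */
    void pop() {
        if (current == 0) {
            throw new IllegalStateException("no caller window");
        }
        current--;
        if (current < oldestResident) {
            System.arraycopy(ram.pop(), 0, regs, base(current), WINDOW);
            oldestResident = current;
        }
    }
}
```

After six calls to `push()` the logical registers 120 to 139 sit on physical registers 0 to 19, and the original contents of window 0 come back once the matching `pop()` calls unwind. The memory traffic that jmcgill objects to is still there in the model; the hardware trick in the post is about when that traffic happens, not whether it does.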
jmcgill - 10 May 2006 22:18 GMT
> Something like that, anyway. It's been a couple of years since I've
> worked on this architecture.
I try to avoid any routine that needs more than 2 or 4 concurrent registers anyway. My sanity depends on that :-)
Roedy Green - 11 May 2006 01:03 GMT On Wed, 10 May 2006 14:18:30 -0700, jmcgill <jmcgill@email.arizona.edu> wrote, quoted or indirectly quoted someone who said :
>I try to avoid any routine that needs more than 2 or 4 concurrent
>registers anyway. My sanity depends on that :-)
Try writing multiprecision routines, e.g. a 64 bit divide with 16 bit registers or something analogous. You soon feel like a one-armed juggler.
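To give a flavour of that juggling, here is a minimal Java sketch of the easiest case only: a 64-bit unsigned value held as four 16-bit limbs, divided by a divisor that itself fits in 16 bits (schoolbook short division). The class and method names are hypothetical, and a full 64-by-64 divide on 16-bit registers is considerably more involved than this. Each step only needs a 32-bit-by-16-bit divide, which the sketch performs with a Java `long` for convenience.

```java
class LimbDivide {
    /** Splits a 64-bit value into four 16-bit limbs, most significant first. */
    static int[] toLimbs(long v) {
        int[] limbs = new int[4];
        for (int i = 0; i < 4; i++) {
            limbs[i] = (int) ((v >>> (48 - 16 * i)) & 0xFFFF);
        }
        return limbs;
    }

    /** Reassembles the first four 16-bit limbs into a 64-bit value. */
    static long fromLimbs(int[] limbs) {
        long v = 0;
        for (int i = 0; i < 4; i++) {
            v = (v << 16) | (limbs[i] & 0xFFFF);
        }
        return v;
    }

    /**
     * Short division by a 16-bit divisor.
     * Returns the four quotient limbs followed by the remainder at index 4.
     */
    static int[] divide(int[] limbs, int divisor) {
        if (divisor <= 0 || divisor > 0xFFFF) {
            throw new IllegalArgumentException("divisor must fit in 16 bits and be non-zero");
        }
        int[] out = new int[5];
        int rem = 0;
        for (int i = 0; i < 4; i++) {
            // rem < divisor <= 0xFFFF, so this partial dividend is below 2^32.
            long partial = ((long) rem << 16) | (limbs[i] & 0xFFFF);
            out[i] = (int) (partial / divisor); // fits in 16 bits because rem < divisor
            rem = (int) (partial % divisor);
        }
        out[4] = rem;
        return out;
    }
}
```

A usage example: `LimbDivide.divide(LimbDivide.toLimbs(v), 1000)` yields limbs that `fromLimbs` turns back into the unsigned quotient of `v` by 1000, with the remainder in the last slot.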
jmcgill - 11 May 2006 01:09 GMT
> On Wed, 10 May 2006 14:18:30 -0700, jmcgill
> <jmcgill@email.arizona.edu> wrote, quoted or indirectly quoted someone
[quoted text clipped - 5 lines]
> Try writing multiprecision routines e.g. 64 bit divide with 16 bit
> registers or analog. You soon feel like a one-armed juggler.
Thanks Roedy, but I have a hard enough time just using libm :-)
alexandre_paterson@yahoo.fr - 10 May 2006 22:47 GMT
> >> > Since the size of integer is fixed (32 bits) on any platform, does java
> >> > program run on 64-bit platform as fast as on 32-bit platform?
[quoted text clipped - 20 lines]
> thought it was only Intel's x86 design which was register-starved, and most
> other architectures had a lot more.
Hi Oliver,
Yup, which was exactly my (funny) point: even going from 32-bit x86 to today's 64-bit CPUs that run a 64-bit JVM (i.e. not all 64-bit CPUs; that's the topic of the thread), you'd be unlikely to find 10x as many "real" GPRs (general-purpose registers).
Which is why I specified "on x86 32 bit" btw :)
> I vaguely remember one processor I worked with (Motorola
> PowerPC?) which had something like 120 physical registers
I'd have said at least 32 real registers on some PPC for sure, and it could indeed have been many more.
:)
Chris Uppal - 11 May 2006 10:36 GMT
> It's been a long time since I've done development in assembly, but I
> thought it was only Intel's x86 design which was register-starved,
IA32 isn't /really/ that register-starved, it only looks that way in the virtual machine language. The number of actual registers depends on the specific chip.
Back onto the OP's question. I don't have /any/ real data myself, but it seems to me that a 64-bit JVM will require larger memory caches to achieve the same level of performance as a 32-bit one, since it will have to move more data around on each object access. My /guess/ is that that will have little effect on the time it takes to dereference an address which is in-cache, or on the time it takes to refresh one cache line, but it will effectively reduce the size of each cache. So, unless the 64-bit CPUs also come with oversize caches, I'd expect a moderate performance hit on "normal" OO code (tight arithmetic loops would presumably not be affected). And of course, a 32-bit machine with that much cache would run faster anyway, so there's a sense in which a 64-bit machine "wastes" cache-space, and so time.
Putting the same idea a different way, I'd expect an equally performing 64-bit machine to cost more than a 32-bit machine, not only because of the Fundamental Law of Computing ("bigger numbers cost more"), but in order to pay for ancillary support hardware like more RAM and bigger caches.
-- chris
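The cache argument can be put into rough numbers with a back-of-the-envelope sketch. Everything here is an assumption made for illustration: the class name, 4-byte references on a 32-bit JVM versus 8-byte references on a 64-bit one, a 1 MiB cache, and four reference fields per object. Object headers, padding and primitive fields are ignored entirely.

```java
class ReferenceFootprint {
    /** Bytes occupied by the reference fields alone across the given number of objects. */
    static long referenceBytes(long objects, int refsPerObject, int bytesPerRef) {
        return objects * refsPerObject * bytesPerRef;
    }

    /** How many objects' worth of reference fields fit in a cache of the given size. */
    static long objectsPerCache(long cacheBytes, int refsPerObject, int bytesPerRef) {
        return cacheBytes / ((long) refsPerObject * bytesPerRef);
    }

    public static void main(String[] args) {
        long cache = 1024L * 1024L; // assumed 1 MiB cache
        int refs = 4;               // assumed reference fields per object
        System.out.println("32-bit refs: " + objectsPerCache(cache, refs, 4) + " objects fit");
        System.out.println("64-bit refs: " + objectsPerCache(cache, refs, 8) + " objects fit");
    }
}
```

Under these assumptions the same cache covers half as many objects once references double in width, which is the sense in which the wider pointers effectively shrink the cache for pointer-heavy code while leaving tight arithmetic loops untouched.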
Hendrik Maryns - 11 May 2006 12:35 GMT
Chris Uppal wrote the following on 05/11/2006 11:36 AM:
>> It's been a long time since I've done development in assembly, but I
>> thought it was only Intel's x86 design which was register-starved,
[quoted text clipped - 19 lines]
> Law of Computing ("bigger numbers cost more"), but in order to pay for
> ancillary support hardware like more RAM and bigger caches.
I won't say anything about prices (the university paid, my colleague did the paperwork), but from testing on some of the benchmark programs that have appeared in this NG, a 64-bit processor emulating 32-bit (that is, working with a 32-bit Java) is way slower than using 64-bit (i.e. 64-bit Java).
H.
 Signature Hendrik Maryns
================== www.lieverleven.be http://aouw.org
Nigel Wade - 12 May 2006 09:52 GMT
[quoted text clipped - 33 lines]
> > H.
That depends which 64bit processor. If it's an Itanium it's a well-known problem - they just were not designed to run 32bit code. That's the principal reason we chose Opteron over Itanium.
In my tests I see no discernible difference between a 32bit and 64bit JVM running on a 64bit Linux platform with Opteron processors. That extends to non-Java code as well. I've seen plenty of benchmarks which show that the Opteron can be quicker running 64bit than 32bit, but that's not my experience in practice with the real-world apps we use.
 Signature Nigel Wade, System Administrator, Space Plasma Physics Group, University of Leicester, Leicester, LE1 7RH, UK E-mail : nmw@ion.le.ac.uk Phone : +44 (0)116 2523548, Fax : +44 (0)116 2523555
Roedy Green - 11 May 2006 18:20 GMT On Thu, 11 May 2006 10:36:22 +0100, "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> wrote, quoted or indirectly quoted someone who said :
>The number of actual registers depends on the
>specific chip.
Yes there are extra registers, but not ones you use in ordinary code to store intermediate results. The x86 has an unusually low number of registers and the registers are not orthogonal. I remember writing my first Motorola 68K assembler and being amazed at how simple it was in comparison with the 8086, with every register orthogonal and without magic properties.