Java Forum / General / March 2008

java versus C or C++ for number crunching

johnmortal.forums@gmail.com - 25 Mar 2008 22:51 GMT
This is the sort of question that I hope won't start an unhappy
discussion, but I wanted to know whether there are any well accepted
tests comparing java to C++ (or C) for doing extensive number
crunching (e.g. multiplying 100,000 vectors in three space by various
matrices, and maybe even a lot of trig used to generate those matrices,
so lots of addition and multiplication). I am using C++ for a number
crunching intensive project because I have been so insistently
informed that Java is slow at number crunching and because a discrete
fast Fourier procedure I wrote really did seem surprisingly slow in Java
(but hey, maybe that's my fault, though I did use the fast version of
the transform). But if Java is not really lagging on number crunching
I would love to switch as I like Java so much. I could write my own
little test program easily enough, but every time someone posts that
they have done so it seems like there are a lot of explanations posted
about why whatever they wrote is a bad test. Are there any accepted
"good tests" on number crunching that have been run recently?

Thank you
-John
Kenneth P. Turvey - 25 Mar 2008 23:45 GMT
On Tue, 25 Mar 2008 14:51:35 -0700, johnmortal.forums wrote:

> This is the sort of question that I hope won't start an unhappy
> discussion, but I wanted to know whether there are any well accepted
> tests comparing java to C++ (or C) for doing extensive number crunching
> (e.g. multiplying 100,000 vectors in three space by various matrices, and
> maybe even a lot of trig used to generate those matrices, so lots of
[Snip]

Some basic guidelines..

C++ will be faster with object creation and destruction .. a lot faster
in my experience.  If you are going to be creating and destroying many
objects the math won't matter.

Java is as fast or faster for math on primitive types.

Java on Intel will be much slower than C on trig functions.  This problem
can be reduced a bit by having a wrapper class that does the trig calls
using JNI, but it will still be slower than C.  On other hardware
platforms this isn't a problem.  On some Intel it might not be a problem
based on the results of an experiment recently conducted in this forum on
another thread (see Sines and Cosines).  

Kenneth P. Turvey <kt-usenet@squeakydolphin.com>

Peter Duniho - 25 Mar 2008 23:54 GMT
> C++ will be faster with object creation and destruction .. a lot faster
> in my experience.  If you are going to be creating and destroying many
> objects the math won't matter.

Is this really true?

In C#, allocations are _much_ faster than in C++, because of the heap  
management differences between the two.  Because C#'s advantage comes  
primarily from the implementation of its garbage collection system, I had  
just assumed that Java had a similar garbage collection implementation and  
thus shared a similar advantage over C++ in that respect.

C# incurs some extra overhead relative to C++ memory management when it  
eventually has to clean up objects, but this rarely impacts performance,  
as the collection only happens when there's memory pressure and/or idle  
moments.  The net is that code that does a lot of allocations usually  
performs better in C# than C++ (though most often there's very little  
practical difference).

It surprises me to hear that Java is significantly _slower_ than C++.  
That would imply that it's got the worst of both worlds: the reclaiming  
overhead of a garbage collecting memory manager, and the allocation  
overhead of a free-list based memory manager.

Surely that's not actually the case?

Pete
Mark Thornton - 26 Mar 2008 00:09 GMT
>> C++ will be faster with object creation and destruction .. a lot faster
>> in my experience.  If you are going to be creating and destroying many
[quoted text clipped - 13 lines]
> as the collection only happens when there's memory pressure and/or idle
> moments.
In big computational tasks you don't have idle moments and eventually
you do have to clean up. So you need to consider the overall cost of
allocation and deallocation. As with C#, an allocation in Java is pretty
trivial (not much more than a pointer increment). But garbage collection
is not free.
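
To make the "pointer increment" remark concrete, here is a toy sketch of bump allocation. The class and method names are invented for illustration, and this is not a description of how any particular JVM is implemented. It only models the idea that handing out memory from a contiguous region can be a bounds test plus an addition, with the real cost deferred to the moment the region fills up.

```java
// Toy model of bump-pointer allocation (hypothetical names, illustration only).
class BumpAllocatorSketch {
    private final int capacity; // size of the region in bytes
    private int top = 0;        // offset of the next free byte

    BumpAllocatorSketch(int capacity) {
        this.capacity = capacity;
    }

    // Returns the offset of the new block, or -1 when the region is full.
    // A real collector would run at that point instead of failing.
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            return -1;
        }
        int offset = top;
        top += size; // the "pointer increment"
        return offset;
    }

    // Stands in for a collection that found no survivors.
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpAllocatorSketch region = new BumpAllocatorSketch(16);
        System.out.println(region.allocate(8)); // first block
        System.out.println(region.allocate(8)); // second block
        System.out.println(region.allocate(1)); // region is full
    }
}
```

The sketch says nothing about how expensive the clean-up is; that part depends on how many objects are still live when the region fills.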

> It surprises me to hear that Java is significantly _slower_ than C++.  
It isn't, but it does depend on the memory use patterns. Tasks that can
be performed with a strict stack allocation pattern favour C++, those
with complex lifetimes (and especially if multithreaded) favour Java (or
C#).

Mark
Peter Duniho - 26 Mar 2008 00:35 GMT
> In big computational tasks you don't have idle moments and eventually  
> you do have to clean up.

I understand that.  But that's a special case.  For a very broad class of  
algorithms, that caveat doesn't apply and the generalization stated --  
"C++ will be faster with object creation and destruction" -- would not be  
valid.

Also, while there is overhead associated with collecting objects, it can  
be relatively inexpensive, especially if no heap compaction is required  
(many allocation patterns lend themselves to that situation).

It's certainly true that one can come up with scenarios in which  
C++ handles memory management faster than C# (and I guess from your  
comments, Java).  But the converse is true as well, and IMHO there's no  
valid generalization that correctly describes the relative performance  
characteristics of those languages.  The best one can say is "it depends".

>> It surprises me to hear that Java is significantly _slower_ than C++.
>
> It isn't, but it does depend on the memory use patterns. Tasks that can  
> be performed with a strict stack allocation pattern favour C++, those  
> with complex lifetimes (and especially if multithreaded) favour Java (or  
> C#).

Okay, that makes more sense (and is different from what I was replying to).

Thanks,
Pete
Kenneth P. Turvey - 26 Mar 2008 02:17 GMT
> I understand that.  But that's a special case.  For a very broad class
> of algorithms, that caveat doesn't apply and the generalization stated
[quoted text clipped - 11 lines]
> characteristics of those languages.  The best one can say is "it
> depends".

I can't really give you the details on why these algorithms work out to
be faster in C, but I can give you my experience.  The one thing that
might be an issue is that structures that might have been allocated on
the stack in C, end up in the heap in Java.  

In my experience, and I'm strictly talking about code that doesn't really
ever have a time when it isn't doing something here,  C++ will be much
faster if the code creates a lot of objects.  I can't really give you the
reasons behind this, but only the results.  I also can't really say
anything about C#, since I haven't programmed on that platform.

Kenneth P. Turvey <kt-usenet@squeakydolphin.com>

Peter Duniho - 26 Mar 2008 02:23 GMT
> I can't really give you the details on why these algorithms work out to
> be faster in C, but I can give you my experience.  The one thing that
> might be an issue is that structures that might have been allocated on
> the stack in C, end up in the heap in Java.

Could very well be.  And C# doesn't have that limitation, since it has the  
concept of non-reference (value) types, which can be allocated on the  
stack.

That said, it seems to me that your original statement could use  
refinement.  Specifically, you didn't qualify the general "C++ will be  
faster" statement with "in my experience".  Only the "a lot faster", which  
implies to me that you're saying C++ is always faster, and in your  
experience it's always faster by "a lot".

I believe in fact, especially given what else others have written here,  
that it's likely that it's not true that C++ is always faster.  I can  
easily believe that for a certain class of algorithms, C++ is always  
faster with respect to memory management, but that's a lot different from  
saying that it's always faster.

That's all I'm trying to say.

Pete
Stefan Ram - 26 Mar 2008 02:48 GMT
>for a certain class of algorithms, C++ is always  
>faster with respect to memory management,

 Quotations regarding memory management:

     »Your essay made me remember an interesting phenomenon I
     saw in one system I worked on. There were two versions of
     it, one in Lisp and one in C++. The display subsystem of
     the Lisp version was faster. There were various reasons,
     but an important one was GC: the C++ code copied a lot of
     buffers because they got passed around in fairly complex
     ways, so it could be quite difficult to know when one
     could be deallocated. To avoid that problem, the C++
     programmers just copied. The Lisp was GCed, so the Lisp
     programmers never had to worry about it; they just passed
     the buffers around, which reduced both memory use and CPU
     cycles spent copying.«

<XNOkd.7720$zx1.5584@newssvr13.news.prodigy.com>

     A lot of us thought in the 1990s that the big battle would
     be between procedural and object oriented programming, and
     we thought that object oriented programming would provide
     a big boost in programmer productivity. I thought that,
     too. Some people still think that. It turns out we were
     wrong. Object oriented programming is handy dandy, but
     it's not really the productivity booster that was
     promised. The real significant productivity advance we've
     had in programming has been from languages which manage
     memory for you automatically.

http://www.joelonsoftware.com/articles/APIWar.html

 Regarding the topic of this thread:

     »Java running faster than C«

http://paulbuchheit.blogspot.com/2007/06/java-is-faster-than-c.html

     »Java theory and practice: Urban performance legends,
     revisited Allocation is faster than you think, and getting
     faster«

http://www.ibm.com/developerworks/java/library/j-jtp04223.html

     »Java vs. C benchmark«

http://www.stefankrause.net/wp/?p=4
http://www.stefankrause.net/wp/?p=6

     »Performance of Java versus C++«

http://www.idiom.com/~zilla/Computer/javaCbenchmark.html

     »The Computer Language Benchmarks Game«

http://shootout.alioth.debian.org/

     »How many times faster or smaller are the Java 6 -server
     programs than the corresponding C GNU gcc programs?«

http://shootout.alioth.debian.org/debian/java.php
Lew - 26 Mar 2008 02:51 GMT
> In my experience, and I'm strictly talking about code that doesn't really
> ever have a time when it isn't doing something here,  C++ will be much
> faster if the code creates a lot of objects.  I can't really give you the
> reasons behind this, but only the results.  I also can't really say
> anything about C#, since I haven't programmed on that platform.

Have you actually measured these speed differences, or is this just a fuzzy
feeling you have?

Lew

Kenneth P. Turvey - 26 Mar 2008 04:41 GMT
> Have you actually measured these speed differences, or is this just a
> fuzzy feeling you have?

I haven't measured them, but they are clearly apparent in the algorithms
I've worked on.  Usually the Java program takes longer than the C program
by a multiple greater than 2.  

Now, if you program in Java in much the same way you would in C, that is
you don't create any objects, Java is actually faster much of the time.  
If however, you program in a way that is natural in Java and best
describes the algorithm you are implementing, you'll find that Java is
much slower than C.  
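
The two styles being contrasted here can be sketched as follows. This is only an illustration under assumptions: the `Vec3` type and both method names are hypothetical, and the example makes no claim about which version is faster on a given JVM. Some JVMs can remove short-lived allocations through escape analysis, so the difference has to be measured rather than assumed.

```java
// Two ways to sum a list of 3-vectors (hypothetical names, illustration only).
class AllocationStyles {
    // "Natural" Java: an immutable value class; every plus() allocates.
    static final class Vec3 {
        final double x, y, z;

        Vec3(double x, double y, double z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        Vec3 plus(Vec3 o) {
            return new Vec3(x + o.x, y + o.y, z + o.z);
        }
    }

    static Vec3 sumObjects(Vec3[] vs) {
        Vec3 acc = new Vec3(0, 0, 0);
        for (Vec3 v : vs) {
            acc = acc.plus(v); // one new object per element
        }
        return acc;
    }

    // "C-style" Java: coordinates packed as x0,y0,z0,x1,... in one array,
    // accumulated in local variables, with no allocation inside the loop.
    static double[] sumFlat(double[] xyz) {
        double x = 0, y = 0, z = 0;
        for (int i = 0; i < xyz.length; i += 3) {
            x += xyz[i];
            y += xyz[i + 1];
            z += xyz[i + 2];
        }
        return new double[] { x, y, z };
    }

    public static void main(String[] args) {
        Vec3[] vs = { new Vec3(1, 2, 3), new Vec3(4, 5, 6) };
        Vec3 s = sumObjects(vs);
        double[] f = sumFlat(new double[] { 1, 2, 3, 4, 5, 6 });
        System.out.println(s.x + " " + s.y + " " + s.z);
        System.out.println(f[0] + " " + f[1] + " " + f[2]);
    }
}
```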

This should all be prefaced with, "In my experience.. ".

Kenneth P. Turvey <kt-usenet@squeakydolphin.com>

Mark Thornton - 26 Mar 2008 19:57 GMT
>> In big computational tasks you don't have idle moments and eventually
>> you do have to clean up.
[quoted text clipped - 3 lines]
> -- "C++ will be faster with object creation and destruction" -- would
> not be valid.

The original question related to "Number Crunching" which usually falls
into that special case.

Mark Thornton
Arne Vajhøj - 26 Mar 2008 00:16 GMT
>> C++ will be faster with object creation and destruction .. a lot faster
>> in my experience.  If you are going to be creating and destroying many
[quoted text clipped - 22 lines]
>
> Surely that's not actually the case?

Nope.

I think Java and .NET GC are very similar.

Arne
Mark Thornton - 26 Mar 2008 00:01 GMT
> On Tue, 25 Mar 2008 14:51:35 -0700, johnmortal.forums wrote:
>
[quoted text clipped - 10 lines]
> in my experience.  If you are going to be creating and destroying many
> objects the math won't matter.

It depends. Java can often be faster with multi threaded code --- the
standard C/C++ allocators have to use locking around every alloc/free
whereas Java allocators are often lock free even on multiprocessors. If
your storage structure (and object lifetime) is sufficiently complex
that your C++ code uses reference counting, then Java's garbage
collector can be a lot faster (reference counting with locking is
relatively slow).

Array access may be slower in Java if the JVM can't eliminate bounds
checks.

> Java is faster, or as fast in math on primitive types.  
>
> Java on Intel will be much slower than C on trig functions.  This problem
If your arguments are in the range +- PI/4 then the difference is not so
great. Larger arguments are slower, but then the result starts to
diverge as well (which may or may not matter to you). Try sin(PI).
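
The sin(PI) experiment takes only a few lines. The class name below is made up for the example. The point it illustrates: `Math.PI` is only the double nearest to pi, and `Math.sin` is specified to be within 1 ulp of the exact result, so the correct answer for that argument is a tiny non-zero number rather than 0. A C library or hardware instruction with less careful argument reduction may print something different, which is the divergence mentioned above.

```java
// Minimal sketch: evaluate sin at the double closest to pi.
class SinPiDemo {
    static double sinOfPi() {
        return Math.sin(Math.PI);
    }

    public static void main(String[] args) {
        System.out.println("Math.PI           = " + Math.PI);
        System.out.println("Math.sin(Math.PI) = " + sinOfPi());
    }
}
```

Running the equivalent C program on the same machine and comparing the printed digits shows whether the two implementations agree for this argument.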

What quality of C/C++ compiler do you have available? When I could last
be bothered to run tests it wasn't all that hard to beat Microsoft's
then current compiler. Intel's best was usually a bit in front.

Mark Thornton
Lew - 26 Mar 2008 02:54 GMT
> It depends. Java can often be faster with multi threaded code --- the
> standard C/C++ allocators have to use locking around every alloc/free
[quoted text clipped - 3 lines]
> collector can be a lot faster (reference counting with locking is
> relatively slow).

To which of Java's several garbage collectors does your comment apply?

Young generation collections in Java are very fast, influenced only by the
number of live objects; dead ones do not add to the GC time.

Lew

Mark Thornton - 26 Mar 2008 20:00 GMT
>> It depends. Java can often be faster with multi threaded code --- the
>> standard C/C++ allocators have to use locking around every alloc/free
[quoted text clipped - 8 lines]
> Young generation collections in Java are very fast, influenced only by
> the number of live objects; dead ones do not add to the GC time.

What seems to be forgotten is even if there are very few live objects
the GC will still take some minimum time. Thus there will be some cost
for every time you fill the young generation. The more frequently you
fill it, the higher the cost. This cost can be reduced by giving the
process a lot of memory, and in particular configuring a very large
young generation.
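
One way to see this cost on your own workload is to ask the JVM how many collections it ran and how long they took. The sketch below uses the standard `java.lang.management` API; the class name, the loop size, and the small `keep` array (there only so that the temporary arrays are not trivially optimized away) are assumptions made for the example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Reports GC activity caused by a loop that produces short-lived garbage.
class GcCostProbe {
    // Collections reported so far, summed over all collectors.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 if undefined for this collector
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long count0 = totalGcCount();
        long millis0 = totalGcMillis();

        Object[] keep = new Object[16];
        long sink = 0;
        for (int i = 0; i < 20_000_000; i++) {
            double[] tmp = new double[3]; // becomes garbage almost immediately
            tmp[0] = i;
            keep[i & 15] = tmp;           // let the array escape briefly
            sink += (long) tmp[0];
        }

        System.out.println("collections: " + (totalGcCount() - count0));
        System.out.println("gc millis:   " + (totalGcMillis() - millis0));
        System.out.println("sink:        " + sink);
    }
}
```

On HotSpot the young generation size can be set with the `-Xmn` option, so running the same program with a small and a large value (for example `-Xmn16m` versus `-Xmn512m`) gives a rough feel for how the collection count changes. The exact numbers will vary by JVM and collector.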

Mark Thornton
Arne Vajhøj - 26 Mar 2008 00:14 GMT
> C++ will be faster with object creation and destruction .. a lot faster
> in my experience.  If you are going to be creating and destroying many
> objects the math won't matter.

No.

All experience shows that GC is more efficient than explicit
deallocation at the cost of poorer real time characteristics.

Arne
Kenneth P. Turvey - 26 Mar 2008 02:21 GMT
> All experience shows that GC is more efficient than explicit deallocation
> at the cost of poorer real time characteristics.

That may be the consensus, but I know that in my experience (primarily
evolutionary computation and image processing) Java has not performed as
well as C when one starts to create many objects.  

There are many good reasons to choose Java, but performance isn't usually
one of them.  YMMV

Kenneth P. Turvey <kt-usenet@squeakydolphin.com>

Peter Duniho - 26 Mar 2008 02:31 GMT
> [...]
> There are many good reasons to choose Java, but performance isn't usually
> one of them.  YMMV

I agree that performance isn't usually the reason one chooses Java.  And  
especially in the context of this thread, I believe that's true ("number  
crunching").

However, there are actually valid performance-based reasons for using an  
environment like Java where a framework is provided.  It is often the case  
that application code spends very little time executing the code delivered  
with the application.  The API to which the application was written is  
where most of the execution is done, and if that API has a  
high-performance implementation, then one can often get better performance  
using that API than trying to write it oneself (especially for a given  
amount of effort).

I can't speak to any specific decision anyone's made along those lines  
with respect to Java (I'm far too inexperienced with Java to have any  
first-hand exposure to that sort of thing), but I have experience with  
other APIs in which counter-intuitively it improved performance to code to  
a framework/API that at first glance seems to add overhead.  Because the  
performance advantage from using thoroughly tested and optimized  
implementations of costly operations exceeds the overhead of whatever's  
required in order to use the framework/API, the net is a gain.

Again, I'm not sure any of that is relevant in this thread.  It's just  
that your comment brought it to mind, and I can't help but mention it.  :)

Pete
Arne Vajhøj - 26 Mar 2008 00:13 GMT
> This is the sort of question that I hope won't start an unhappy
> discussion, but I wanted to know whether there are any well accepted
[quoted text clipped - 12 lines]
> about why whatever they wrote is a bad test. Are there any accepted
> "good tests" on number crunching that have been run recently?

I am not aware of any generally accepted tests.

In fact I doubt that it is possible to create such a test, because
what is a good test depends on the problem that needs to be solved.

Forget all the crap from the mid-90s about Java being interpreted
and slow etc.

The JIT compilers used in modern JVMs are quite good.

That said, I would still expect C/C++ to be slightly faster than
Java for your usage. Java checks array indexes - C/C++ does not. And
in general I doubt that sufficient time has been spent optimizing
floating point in JVMs. Floating point is not a big usage area
for Java. Fortran and C still dominate that area.

Whether you will be willing to spend time tracking down various
memory overwrites and memory leaks in C/C++ to gain, let us guess, 10-20%
in performance is something you will have to decide.

Arne
Mark Thornton - 26 Mar 2008 00:23 GMT
> Java for your usage. Java checks array indexes - C/C++ does not. And

for (int i=0; i<a.length; i++)
   ... a[i] ...

The server JVM will eliminate the bounds check in cases like this
(assuming 'a' isn't declared volatile). More generally whenever the loop
range is not changed within the loop.
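
For comparison, here are two loop shapes side by side. The class and method names are hypothetical, and whether a particular JVM actually removes the checks is implementation dependent; the sketch only shows the kind of code that matches the description above and a kind that likely does not.

```java
// Two loop shapes with different prospects for bounds-check elimination.
class BoundsCheckLoops {
    // Canonical form: i runs from 0 to a.length and the loop range is
    // not changed inside the loop, so every a[i] is provably in range.
    static double sumCanonical(double[] a) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Indirect indexing: idx[i] can hold any int, so the access a[idx[i]]
    // generally still needs a range check at run time.
    static double sumIndirect(double[] a, int[] idx) {
        double sum = 0.0;
        for (int i = 0; i < idx.length; i++) {
            sum += a[idx[i]];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = { 1.0, 2.0, 3.0 };
        System.out.println(sumCanonical(a));
        System.out.println(sumIndirect(a, new int[] { 2, 0 }));
    }
}
```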

> in general I doubt that sufficient time has been spent optimizing
> floating point in JVMs.
Nevertheless it is quite good at it. I think scalar SSE2 instructions
are used for example.

Mark
Patricia Shanahan - 26 Mar 2008 00:45 GMT
> This is the sort of question that I hope won't start an unhappy
> discussion, but I wanted to know whether there are any well accepted
[quoted text clipped - 12 lines]
> about why whatever they wrote is a bad test. Are there any accepted
> "good tests" on number crunching that have been run recently?

There is only one test that can accurately predict the performance of
your code - running your code.

Here's what I would do in your situation:

1. Extract from a few of your programs pieces of code that are
relatively small but take a high proportion of the run time.

2. Write programs around those pieces of code that set up typical test
data and check the results. These programs should also be dominated by
the code from step 1.

3. Re-implement a step 2 program in Java. Compare the new performance.
If it is good enough for your purposes, repeat for each of the programs.
If one of the jobs does not run well enough, rerun it on each new major
release of Java, but stick with C++.

If, on the other hand, Java does well enough on each of the tests, then
start writing some of your new programs in Java.

This procedure is not designed to answer some great absolute "Is Java
good for number crunching?" question. It is designed to answer the
question of whether you would get performance you like if you switched
to Java for the programs you are writing.
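
As a rough illustration of step 2, the sketch below wraps one kernel (a 3x3 matrix applied to 100,000 vectors, as in the original question) in a small timing harness. The class name `MatVecBench`, the packed array layout, and the repetition counts are assumptions picked for the example and would need to be replaced by code and data from the real programs. It warms up before timing and uses the results so the work cannot be discarded.

```java
// Minimal timing harness around one number-crunching kernel.
class MatVecBench {
    // Applies a row-major 3x3 matrix m to vectors packed as x0,y0,z0,x1,...
    static void transform(double[] m, double[] in, double[] out) {
        for (int i = 0; i < in.length; i += 3) {
            double x = in[i], y = in[i + 1], z = in[i + 2];
            out[i]     = m[0] * x + m[1] * y + m[2] * z;
            out[i + 1] = m[3] * x + m[4] * y + m[5] * z;
            out[i + 2] = m[6] * x + m[7] * y + m[8] * z;
        }
    }

    // Rotation about the z axis, so some trig is part of the workload.
    static double[] rotationZ(double angle) {
        double c = Math.cos(angle), s = Math.sin(angle);
        return new double[] { c, -s, 0, s, c, 0, 0, 0, 1 };
    }

    public static void main(String[] args) {
        int n = 100_000;
        double[] in = new double[3 * n];
        double[] out = new double[3 * n];
        for (int i = 0; i < in.length; i++) {
            in[i] = i * 1e-3;
        }

        // Warm-up passes give the JIT compiler a chance to compile the
        // hot loop before timing starts.
        for (int r = 0; r < 200; r++) {
            transform(rotationZ(r * 0.01), in, out);
        }

        int reps = 1000;
        double check = 0.0;
        long start = System.nanoTime();
        for (int r = 0; r < reps; r++) {
            transform(rotationZ(r * 0.01), in, out);
            check += out[r % out.length]; // consume a result from each pass
        }
        long elapsed = System.nanoTime() - start;

        System.out.printf("%d reps, %.3f ms per rep, checksum %g%n",
                reps, elapsed / 1e6 / reps, check);
    }
}
```

The matching C++ program should use the same data, the same repetition count, and print the same checksum, so that the two runs can be checked against each other before their times are compared.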

Patricia
northerntechie - 26 Mar 2008 23:44 GMT
I have not heard of any conversation regarding 'actual' process cycle
time.  Most tests I have seen compare start and end clocks, nothing is
accounted for in the kernel context switching, hardware interrupts,
and all that other jazz in the background.  There are a couple general
purpose hardware counters around (with a little kernel, possibly user
mode code manipulation) that can track actual cycle times.  All these
numbers are irrelevant without contextual and focused tests,
optimization aside.  Throw in a SMP based JVM GC, operating within my
process timeslice whenever my objects are all used up and my resources
are drained, and I would feel a little slow as well.
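
Within Java itself, a partial answer to the start-and-end-clock problem is to read per-thread CPU time next to wall-clock time. The sketch below uses the standard `ThreadMXBean` API; the class name and the `work` method are placeholders for real code. It is not a hardware cycle counter, and it only covers the calling thread, so time spent in separate GC or JIT threads is not included in the CPU figure.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Compares wall-clock time with CPU time charged to the current thread.
class CpuVsWallClock {
    // Placeholder workload: sum of square roots of 1..n.
    static double work(int n) {
        double acc = 0.0;
        for (int i = 1; i <= n; i++) {
            acc += Math.sqrt(i);
        }
        return acc;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long wall0 = System.nanoTime();
        long cpu0 = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;

        double result = work(50_000_000);

        long cpu1 = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        long wall1 = System.nanoTime();

        System.out.printf("wall: %.1f ms%n", (wall1 - wall0) / 1e6);
        if (cpuSupported && cpu0 >= 0 && cpu1 >= 0) {
            System.out.printf("cpu:  %.1f ms%n", (cpu1 - cpu0) / 1e6);
        } else {
            System.out.println("cpu:  not available on this JVM");
        }
        System.out.println("result: " + result);
    }
}
```

A large gap between the two figures suggests the thread was waiting or descheduled for part of the run, which a plain start/end clock would silently count as compute time.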

I think too much in the micro-controller world where multi-tasking is
rare, and if you run across it, it is only implemented by the hardy
few.

Todd Saharchuk, AScT.

