Java Forum / General / June 2007
C++ vs Java "new" (no flame war please!)
mlw - 01 Apr 2007 04:26 GMT

Do not read anything into this; it is not a flame or a troll. While I'm not new to Java, I favor C++. However, I may need to use Java in a contract position, and I am concerned about the restrictions it places on application code.
Take, for instance, this C++ construct:
    class foo
    {
        char *m_name;
        ....
        void * operator new(size_t size, char *string);
    };

    // One malloc holds both the object and a copy of its string.
    void *foo::operator new(size_t size, char *string)
    {
        size_t cbstr = strlen(string)+1;
        size_t cb = cbstr + size;
        foo * t = (foo *) malloc(cb);
        char * name = (char *) &t[1];   // string storage starts right after the object
        strcpy(name, string);
        t->m_name = name;
        return t;
    }
The above example is a methodology that can be used to reduce the CPU and memory overhead of malloc. You may argue that this is not a valid concern, but if you have 10 or 100 million objects, the malloc block overhead alone makes this worthwhile. Hint: this is actually a simplification; sometimes malloc is not used at all, and a big array is pre-allocated and each new carves its space out of it.
Is there a way to create 10 to 100 million objects in Java with a reasonable system configuration?
Lew - 01 Apr 2007 19:05 GMT

> (snipped description of custom 'new' operator)
> Is there a way to create 10 to 100 million objects in Java with a reasonable
> system configuration?

Sure, given the same considerations of available heap that you would have in the C++ world.
Let's assume you want to create N objects where N is the largest number of such objects that would fit in available memory.
    List <Foo> stuff = new ArrayList <Foo> (N);
    for ( int x = 0; x < N; ++x )
    {
        stuff.add( new Foo( getAString() ) );
    }
    doSomethingWith( stuff );
If you don't need them all in memory at once, it's even easier:
    for ( int x = 0; x < N; ++x )
    {
        Foo foo = new Foo( getAString() );
        doSomethingWith( foo );
    }
I posit the 'getAString()' method as where you'd obtain the equivalent of the 'char * string' in the C++ example. I assumed you'd use a different string for each instance of Foo.
-- Lew
mlw - 01 Apr 2007 21:23 GMT

> > (snipped description of custom 'new' operator)
> [quoted text clipped - 3 lines]
> Sure, given the same considerations of available heap that you would have
> in the C++ world.

Give or take, I guess.

> Let's assume you want to create N objects where N is the largest number of
> such objects that would fit in available memory.
>
> List <Foo> stuff = new ArrayList <Foo> (N);

That's one memory alloc, sure.

> for ( int x = 0; x < N; ++x )
> {
> stuff.add( new Foo( getAString() ) );
> }

The above code is exactly my problem with Java. In C++ I can overload new and put all the objects anywhere I want, even into one contiguous memory block, calling malloc merely once and combining the object and the string in one allocation. For instance:
(Remember this is a simplification for example purposes, but the technique is the important thing.)
    unsigned char * myshared_block = shared_alloc(MAX_SIZE);
    size_t curr_offset=0;

    void *foo::operator new(size_t size, char * str)
    {
        size_t cbstr = strlen(str)+1;
        size_t cb = size + cbstr;

        // Carve the object out of the shared block; its string follows it.
        foo * fooT = (foo *) &myshared_block[curr_offset];
        curr_offset += cb;
        char *pstr = (char *)&fooT[1];
        strcpy(pstr, str);
        fooT->m_name = pstr;
        return (void *) fooT;
    }
In the above code, I can pre-allocate a single memory block and pull objects out of it until it is empty.
Every memory allocation has overhead, in GCC malloc, it is probably 4 bytes, 8 bytes on 64 bit systems. So, if you have fairly small objects, and lots of them, a good chunk of memory will be eaten up with malloc overhead. If you have a small object with a string, you will have two memory allocations!
Obviously this is a rare problem, but it is a problem none the less.
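For comparison, a rough Java analogue of the same idea is sketched below. It assumes Java 7 or later, and the class PackedStrings and its methods are invented for this illustration; nothing like it appears in the thread. Instead of one object plus one String per entry, it appends the bytes of every string to a single byte[] and keeps the start offsets in an int[], so the per-entry cost is the payload plus one int. It is a sketch, not a tuned implementation: one instance is limited to what an int-indexed array can hold, and get() still allocates a temporary String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Sketch: stores many short strings in one growing byte[] plus an int[] of
 * start offsets, so there is no per-string object on the heap.
 * The class name and API are hypothetical, not taken from the thread.
 */
final class PackedStrings {
    private byte[] data = new byte[1 << 16]; // shared block, grown on demand
    private int[] starts = new int[1024];    // starts[i] = offset of string i
    private int count = 0;                   // number of strings stored
    private int used = 0;                    // bump pointer into data

    /** Appends a string and returns its index. */
    int add(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (used + bytes.length > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, used + bytes.length));
        }
        if (count + 1 >= starts.length) {
            starts = Arrays.copyOf(starts, starts.length * 2);
        }
        System.arraycopy(bytes, 0, data, used, bytes.length);
        starts[count] = used;
        used += bytes.length;
        starts[count + 1] = used; // end marker for the string just added
        return count++;
    }

    /** Rebuilds string i from the shared block (allocates a temporary String). */
    String get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return new String(data, starts[i], starts[i + 1] - starts[i], StandardCharsets.UTF_8);
    }

    int size() {
        return count;
    }
}
```

A caller would use p.add(line) while reading input and p.get(i) when a value is needed. The trade-off is the usual one for this kind of packing: fewer and larger allocations in exchange for decoding work on every access.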
> doSomethingWith( stuff );
> If you don't need them all in memory at once, it's even easier:
> [quoted text clipped - 7 lines]
> the 'char * string' in the C++ example. I assumed you'd use a different
> string for each instance of Foo.

Lew - 02 Apr 2007 06:59 GMT

Lew wrote:
>> Let's assume you want to create N objects where N is the largest number of
>> such objects that would fit in available memory.
>>
>> List <Foo> stuff = new ArrayList <Foo> (N);

> That's one memory alloc, sure.

Lew wrote:
>> for ( int x = 0; x < N; ++x )
>> {
>> stuff.add( new Foo( getAString() ) );
>> }

> The above code is exactly my problem with Java. In C++ I can overload new
> and put all the objects anywhere I want, even into one contiguous memory
> block calling malloc merely once and combining the object and the string in
> one allocation,

You still need to calculate offsets into that block to fix the start of each individual object. In fact, the code in your custom allocator uses more memory and much more time than would the JVM for the equivalent Java class.
> Every memory allocation has overhead, in GCC malloc, it is probably 4 bytes,
> 8 bytes on 64 bit systems. So, if you have fairly small objects, and lots
> of them, a good chunk of memory will be eaten up with malloc overhead. If
> you have a small object with a string, you will have two memory
> allocations!

Your C++ class did a copy of the string. A Java class likely would not, since Strings are immutable, so it would only do one allocation for the Foo object and none for the String. Even if one did copy the String, the time overhead of the Java allocation and copy would be much less than for the C++ code you showed. For one thing, the Java code would only loop through the String once, not twice as in your C++ code; it would have no need to calculate "strlen()". The memory overhead would be no different since a copy is a copy is a copy.
But as I said, the Java code would not copy the String, so the point is moot. One allocation and less memory overhead in the Java version.
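To illustrate the point about immutability, here is a minimal Java sketch; the class names Holder and SharedStringDemo are made up for this example. Storing a String in a field copies only the reference, so the object and its caller share one String. It does not cover the case where the characters first have to be read from a file, since a String has to be created somewhere in that case.

```java
/** Sketch: a holder that keeps a reference to the caller's String. */
final class Holder {
    private final String text;

    Holder(String text) {
        this.text = text; // stores the reference; no characters are copied
    }

    String text() {
        return text;
    }
}

final class SharedStringDemo {
    /** Returns true if the Holder refers to the very same String object it was given. */
    static boolean sharesReference() {
        String s = Integer.toHexString(48879); // one String is created here
        Holder h = new Holder(s);              // one Holder is created; the String is shared
        return h.text() == s;                  // identity comparison, intentionally not equals()
    }
}
```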
> Obviously this is a rare problem, but it is a problem none the less.

I don't see what the problem with Java is. What is the bad effect that concerns you with Java?
Is it memory overhead? Objects in Java take up only as much space as they take.
Is it time overhead? Java object allocations run on the order of 10-20 machine cycles. Initialization of the object takes some time, perhaps, but that would be true with your custom allocator as well.
Your description seems to delineate a problem with C++ that your custom allocator handles, but I see nothing of this relevant to Java.
-- Lew
mlw - 02 Apr 2007 17:49 GMT

> Lew wrote:
>>> Let's assume you want to create N objects where N is the largest number
[quoted text clipped - 20 lines]
> memory and much more time than would the JVM for the equivalent Java
> class.

That is simply not true on either count. I don't need to "fix" the start of anything. There is no memory overhead for each class. The "allocation" overhead time is nothing more than a simple integer addition.

>> Every memory allocation has overhead, in GCC malloc, it is probably 4
>> bytes, 8 bytes on 64 bit systems. So, if you have fairly small objects,
[quoted text clipped - 3 lines]
>
> Your C++ class did a copy of the string.

Yes.

> A Java class likely would not,
> since Strings are immutable, so it would only do one allocation for the
> Foo object
> and none for the String.

Well, not seen in the example would be that the string is read from a file into a local buffer which gets reused.

> Even if one did copy the String, the time
> overhead of the Java allocation and copy would be much less than for the
> C++ code you
> showed.

How is this possible?

    char buffer[MAX_SIZE];

    while(!feof(f))
    {
        if(fgets(buffer, sizeof(buffer), f))
        {
            new(buffer) foo();
        }
    }

Where is the allocation overhead that you are referring to?
> For one thing, the Java code would only loop through the String
> once, not twice as in your C++ code; it would have no need to calculate
> "strlen()".
> The memory overhead would be no different since a copy is a copy is a
> copy.

Like I said, I was showing a technique that was simplified for the example. Don't nit-pick the example, because it is obviously much less sophisticated than the actual application that uses the technique. In the real code, the new operator is much more complex and reads the data itself.

> But as I said, the Java code would not copy the String, so the point is
> moot. One allocation and less memory overhead in the Java version.

Java would allocate a new string for each string. My code does not issue any memory allocation for the string, it comes from a fixed length block and it is combined with the object proper.

>> Obviously this is a rare problem, but it is a problem none the less.
> [quoted text clipped - 3 lines]
> Is it memory overhead? Objects in Java take up only as much space as they
> take.

Try this:
Allocate 20,000,000 objects, with each object containing at least one string. The memory footprint of the application has AT LEAST 160 megabytes of overhead that can be virtually eliminated by pre-allocating 4 megabyte blocks and sub-allocating out of them.
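The post does not say how the 160 megabyte figure was derived. One reading that is consistent with the per-allocation overheads mentioned earlier in the thread (4 or 8 bytes) is worked through below; both sets of inputs are assumptions, and the class name OverheadEstimate is made up. Decimal megabytes are used.

```java
final class OverheadEstimate {
    /** Total bookkeeping bytes for a given number of allocations. */
    static long overheadBytes(long allocations, long bytesPerAllocation) {
        return allocations * bytesPerAllocation;
    }

    public static void main(String[] args) {
        long objects = 20_000_000L;
        // Assumption: one allocation per object, 8 bytes of overhead each.
        System.out.println(overheadBytes(objects, 8) / 1_000_000 + " MB");     // prints "160 MB"
        // Assumption: two allocations per object (object + string), 4 bytes each.
        System.out.println(overheadBytes(2 * objects, 4) / 1_000_000 + " MB"); // prints "160 MB"
    }
}
```

Either way the estimate only counts allocator bookkeeping. It says nothing about Java object headers, which are a separate and JVM-specific cost.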
> Is it time overhead? Java object allocations run on the order of 10-20
> machine cycles. Initialization of the object takes some time, perhaps,
> but that would be true with your custom allocator as well.

Java allocations take 10-20 machine cycles? Not on your life. In JIT compiled code in which objects lose scope on exit of the function, yes, that may be the overall time (it could even be much less), but objects that live beyond a function scope have the overhead of a memory allocation.

> Your description seems to delineate a problem with C++ that your custom
> allocator handles, but I see nothing of this relevant to Java.

Java:
    class foo {
        String m_test;

        foo(String value) {
            m_test = value;
        }
        void print() {
            System.out.println(m_test+"\n");
        }
    };
    public class test {
        public static void main(java.lang.String[] args) {
            int i=0;
            try {
                foo arr[] = new foo[2000000];
                for(i=0; i < 2000000; i++)
                    arr[i] = new foo(Integer.toHexString(i));
            }
            catch(OutOfMemoryError e) {
                System.out.println(e.getMessage());
                System.out.println("Total object:" + i);
            }
        }
    };
C++:

    #include <stdlib.h>
    #include <unistd.h>
    #include <assert.h>
    #include <string.h>
    #include <stdio.h>

    class foo
    {
        char *m_test;

    public:
        foo();
        void * operator new(size_t size, void *p, size_t cb);
    };

    #define BLKSIZE (1024*1024)

    unsigned char *block = NULL;
    size_t blk_size=0;
    size_t blk_offset=0;

    foo::foo()
    {
    }

    // Sub-allocates the object plus cb bytes of string data from the current block.
    void *foo::operator new(size_t size, void *p, size_t cb)
    {
        size_t totalcb = size+cb;

        // Start a fresh block when the current one cannot hold object + string.
        if(!block || totalcb > (blk_size - blk_offset))
        {
            block = (unsigned char *) malloc(BLKSIZE);
            blk_size = BLKSIZE;
            blk_offset = 0;
        }
        assert(block);

        foo *fooT = (foo *) &block[blk_offset];
        blk_offset += size;
        fooT->m_test = (char *) &block[blk_offset];
        blk_offset += cb;
        memcpy(fooT->m_test, p, cb);
        return (void *) fooT;
    }

    int main()
    {
        int i=0;
        foo ** ar = (foo **) malloc(sizeof(foo*) * 2000000);

        for(i=0; i < 2000000; i++)
        {
            char buffer[64];
            size_t cb = snprintf(buffer,sizeof(buffer), "%X", i)+1;
            ar[i] = new((void *)buffer,cb) foo();
        }
        printf("%d\n", i);
    }
The results:

    test@localhost:~/scat$ time java test
    Java heap space
    Total object:914828

    real 0m23.519s
    user 0m21.009s
    sys 0m2.004s

    test@localhost:~/scat$ time ./test
    2000000

    real 0m0.838s
    user 0m0.724s
    sys 0m0.064s
Lew - 02 Apr 2007 18:19 GMT

> That is simply not true on either count, I don't need to "fix" the start
> of anything. There is no memory overhead for each class. The "allocation"
> overhead time is nothing more than a simple integer addition.

It is that addition to which I refer.

Lew wrote:
>> Even if one did copy the String, the time
>> overhead of the Java allocation and copy would be much less than for the
>> C++ code you
>> showed.

> How is this possible?

I was referring to your code:

> void *foo::new(size_t size, char *string)
> {
[quoted text clipped - 5 lines]
> t->m_name = name;
> }

strlen() has overhead and strcpy() has overhead.

> Well, not seen in the example would be that the string is read from a file
> into a local buffer which gets reused.

In Java, it would be read into a new buffer each time, the allocation overhead of which is negligible, and the copy from the file into the buffer would be the only copy. There would not be the copy represented in your code by the strcpy() call. The small overhead of the new buffer allocation is well offset by the savings in the string copy time.

> char buffer[MAX_SIZE]
> [quoted text clipped - 7 lines]
>
> Where is the allocation overhead that you are referring to?

In the new operator that you wrote, particularly in the strlen() and strcpy() calls, which add to the time and memory footprints.

> Java would allocate a new string for each string. My code does not issue any
> memory allocation for the string, it comes from a fixed length block and it
> is combined with the object proper.

But the Java code would have one less copy of that string, and that would save copy time. The fact that the memory is allocated at new time in Java is about the same as the memory add at new time in your C++ example. A Java object allocation is not much more than a memory limit add.

> arr[i] = new foo(Integer.toHexString(i));
>
> size_t cb = snprintf(buffer,sizeof(buffer), "%X", i)+1;

You might be comparing the speed of toHexString() to that of snprintf(), and not allocation times.
You also are not comparing Java with custom allocators to Java without custom allocators. I am not arguing that Java is faster than C++, only that custom allocators in Java would not help the Java performance.
With Java you have the overhead of bytecode interpretation and multiple threads running in the JVM at the same time, plus there is the startup time of the JVM itself that you did not factor out. Consequently you have not measured memory allocation time vs. memory allocation time and we can draw no conclusions about the relative efficiency of the two schemes.
Put your timing loops inside the program, and while you're at it give the Java Hotspot compiler a few hundreds of loops to settle in. Oh, and try java -client vs. java -server. You've got to factor out the overhead of setup and so on before timing comparisons make sense.
-- Lew
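A minimal sketch of what "timing loops inside the program" with a warm-up phase could look like in Java follows. The class AllocationTimer, its method names, and the round counts are illustrative assumptions, not code from the thread. It measures only the allocation loop with System.nanoTime(), after running the same loop several times so the JIT has a chance to compile it. As noted above, the loop still includes the cost of Integer.toHexString(), and a single timed round is a rough indication at best.

```java
final class AllocationTimer {
    /** Small object with one String field, mirroring the shape used in the thread. */
    static final class Foo {
        final String text;

        Foo(String text) {
            this.text = text;
        }
    }

    /** Allocates n small objects and returns the array so the work is not dead code. */
    static Foo[] allocate(int n) {
        Foo[] arr = new Foo[n];
        for (int i = 0; i < n; i++) {
            arr[i] = new Foo(Integer.toHexString(i));
        }
        return arr;
    }

    /** Runs warm-up rounds, then returns the elapsed nanoseconds of one timed round. */
    static long timeNanos(int n, int warmupRounds) {
        long sink = 0;
        for (int r = 0; r < warmupRounds; r++) {
            sink += allocate(n).length; // untimed rounds; results are still consumed
        }
        long start = System.nanoTime();
        Foo[] arr = allocate(n);
        long elapsed = System.nanoTime() - start;
        if (sink + arr.length < 0) {
            throw new IllegalStateException("unreachable; keeps the results live");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Sizes are illustrative; choose n so the array fits in the configured heap.
        System.out.println("ns: " + timeNanos(200_000, 200));
    }
}
```

Running it under both java -client and java -server, where those options are available, would follow the suggestion above. JVM startup is excluded because the clock starts only after the warm-up rounds.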
mlw - 02 Apr 2007 19:18 GMT

>> That is simply not true on either count, I don't need to "fix" the
>> start of anything. There is no memory overhead for each class. The
[quoted text clipped - 23 lines]
>
> strlen() has overhead and strcpy() has overhead.

Only CPU overhead, and like I said, this is a simplification.

>> Well, not seen in the example would be that the string is read from a
>> file into a local buffer which gets reused.
[quoted text clipped - 21 lines]
> In the new operator that you wrote, particularly in the strlen() and
> strcpy() calls, which add to the time and memory footprints.

strlen and strcpy never call malloc. The function strdup does call malloc.

>> Java would allocate a new string for each string. My code does not issue
>> any memory allocation for the string, it comes from a fixed length block
[quoted text clipped - 13 lines]
> You might be comparing the speed of toHexString() to that of snprintf(),
> and not allocation times.

I think you are very confused about memory allocation in Java, it is not trivial when it exceeds certain limits or exists outside function scope. Just because you have no control over how Java does this, does not mean it is something you don't need to know about.

When I change the lines to:

    arr[i] = new foo("Test:"+i);

and

    size_t cb = snprintf(buffer,sizeof(buffer), "Test:%X", i)+1;

in the Java and C++ code, respectively, I get pretty much the same results:

    test@localhost:~/scat$ time ./test
    2000000

    real 0m1.020s
    user 0m0.860s
    sys 0m0.092s

    test@localhost:~/scat$ time java test
    Java heap space
    Total object:741838

    real 0m19.178s
    user 0m16.893s
    sys 0m1.728s
Lew - 03 Apr 2007 00:02 GMT

> real 0m1.020s
> user 0m0.860s
[quoted text clipped - 6 lines]
> user 0m16.893s
> sys 0m1.728s

You're still not timing memory allocation but JVM startup and other factors.

-- Lew

mlw - 03 Apr 2007 00:33 GMT

>> real 0m1.020s
>> user 0m0.860s
[quoted text clipped - 9 lines]
> You're still not timing memory allocation but JVM startup and other
> factors.

Start up is NOT 18 seconds, so what other factors could we possibly be talking about?

Lew - 03 Apr 2007 03:13 GMT

>>> real 0m1.020s
>>> user 0m0.860s
[quoted text clipped - 11 lines]
> Start up is NOT 18 seconds, so what other factors could we possibly be
> talking about?

Beats me.

-- Lew

ITMozart - 30 Jun 2007 04:03 GMT

>>>> real 0m1.020s
>>>> user 0m0.860s
[quoted text clipped - 15 lines]
>
> -- Lew

Actually it doesn't beat you. Instead, it beats people who are inexperienced enough in a subject to publish microbenchmarks on that very subject.
First law of microbenchmarks: microbenchmarks are flawed and misleading by definition. Corollary: Heisenbenchmark principle.
The following code demonstrates one flaw of the presented microbenchmark, regarding the speed analysis. There is [at least] another big flaw in the comparison; guess which.
Also, if we want to play the "speed freaks" game, it is possible, via some simple and clean optimizations, to obtain about 5x speed and 1.5x objects total at the same time.
/* --- */
    class foo {
        private String m_test;

        foo(String value) {
            m_test = value;
        }

        void print() {
            System.out.println(m_test);
        }
    };

    public class LargeAllocation {
        private static final int TOT_OBJS = 2000 * 1000; /* takes 8 secs for 914733 objects, after which the JVM explodes */
        // 900 * 1000; // FLAW: should take few less than 8 secs, right?

        private static long start, end;

        public static void main(String[] args) {
            start = System.currentTimeMillis();

            allocate(TOT_OBJS);

            end = System.currentTimeMillis();
            System.out.println("Time: " + (end - start));
        }

        private static void allocate(int numObjs) {
            int i = 0;
            try {
                foo arr[] = new foo[numObjs];
                for (i = 0; i < numObjs; i++)
                    arr[i] = new foo(Integer.toHexString(i));
            } catch (OutOfMemoryError e) {
                System.out.println("Error: " + e.getMessage());
                System.out.println("Total objects: " + i);
            }
        }
    };
Bye! IM
Mark Rafn - 02 Apr 2007 16:18 GMT

>Take, for instance, this C++ construct:
>void *foo::new(size_t size, char *string)
[quoted text clipped - 6 lines]
> t->m_name = name;
>}

>The above example is a methodology that can be used to reduce the CPU and
>memory overhead of malloc.

Trying not to flame C++ here, but I'm glad not to see very much of this kind of code in Java.

>You may argue that this is not a valid concern

In some apps, it is. If you're extremely sensitive to exact memory allocation, or need hardware access that Java doesn't give you (say, to SysV shared memory or something), C++ is a good choice. I'd generally recommend doing the specific sensitive bit in C++ and the rest in Java, but it'll depend entirely on what the app actually does.

>Is there a way to create 10 to 100 million objects in Java with a reasonable
>system configuration?

Any VM since 1.4 on modern hardware should not have a problem with this, unless you're pretty time-sensitive.

-- Mark Rafn dagon@dagon.net <http://www.dagon.net/>
mlw - 02 Apr 2007 19:30 GMT

>>Take, for instance, this C++ construct:
>>void *foo::new(size_t size, char *string)
[quoted text clipped - 12 lines]
> Trying not to flame C++ here, but I'm glad not to see very much of this
> kind of code in Java.

It's not pretty, not at all, but it is the sort of code that can make the difference between running or not running when necessary or within the time requirements. Fortunately, you can write it, test it, make sure it works, then hide it in a library where newbies can't screw around and break it.

>>You may argue that this is not a valid concern
>
[quoted text clipped - 4 lines]
> recommend to do the specific sensitive bit in C++ and the rest in Java,
> but it'll depend entirely on what the app actually does.

That's sort of why I posted the thread. I may need to do some "interesting" things in Java and was kind of looking to see if anyone could come up with a good trick or two.

>>Is there a way to create 10 to 100 million objects in Java with a
>>reasonable system configuration?
>
> Any VM since 1.4 on modern hardware should not have a problem with this,
> unless you're pretty time-sensitive.

To address the "time-sensitive" comment, many times I hear that the machines are fast enough that you don't need to worry about performance, I have to say this is the same argument for the 4 day work week. "Soon, we'll be productive enough that we'll only have to work 4 days a week." That was the dream and the myth. The problem with that line of thought is that as capability is increased, so are expectations. I don't know about you, but the 4 day work week hasn't come to my corner of the globe.
If you make a program that takes an hour to do something, and someone comes along and writes one that takes 5 minutes, you lose, no matter what the program is written in.
Mark Rafn - 03 Apr 2007 00:59 GMT

>>>Is there a way to create 10 to 100 million objects in Java with a
>>>reasonable system configuration?

>> Any VM since 1.4 on modern hardware should not have a problem with this,
>> unless you're pretty time-sensitive.

>To address the "time-sensitive" comment, many times I hear that the machines
>are fast enough that you don't need to worry about performance,

I don't think I'd ever say that. I will say that machines are fast and cheap enough that for a whole lot of uses you can optimize for clarity, maintainability, and developer time over every last cycle of performance. It's definitely still a tradeoff, just one that's shifted quite a ways.

>I have to say this is the same argument for the 4 day work week. "Soon,
>we'll be productive enough that we'll only have to work 4 days a week."

I'm productive enough to earn far more than 125% of what I did a few years ago. No myth here.

>If you make a program that takes an hour to do something, and someone comes
>along and writes one that takes 5 minutes, you lose, no matter what the
>program is written in.

Depends on the program. I have programs that run for an hour each day, that could probably be cut down to 30 minutes (the difference would be changes in DB usage and transactional consistency, not just reimplement in a different language), but I have better things to worry about.

The VAST majority of programs aren't 12:1 difference. If you write a program that takes 11 seconds to do something, and someone comes along and writes one that does it in 10.6, but only on some platforms and in a hard-to-maintain way, I think they lose.

-- Mark Rafn dagon@dagon.net <http://www.dagon.net/>
ITMozart - 30 Jun 2007 01:11 GMT

> Is there a way to create 10 to 100 million objects in Java with a reasonable
> system configuration?

A question: do this routine and the 10 to 100 million objects refer to a real-world case, or are they just speculation? What exactly does the application do overall?

I think that much of the speed/size speculation going around is based only on micro-benchmarks, and that there is an absolute lack of real-world benchmarks (entire applications converted from one language to another).

Micro-benchmarks have no real scientific value, but they remain widespread because they're easy to develop. I find it very curious that nobody would think of comparing two F1 cars just by benchmarking their tires (!?), yet the same approach to programming language implementations is commonly regarded as perfectly reasonable and accurate by the programming community.

Quake 2 was written in C and assembler by an all-time programming master (John Carmack). Jake 2, the Java conversion, is 85% faster. This is impressive.

I stress that this is the only real-world benchmark I've EVER seen. I would be EXTREMELY interested in real-world benchmarks, but I still can't find any.

The entire story reminds me of an article written by Tom Miller:
http://msdn.microsoft.com/msdnmag/issues/05/08/EndBracket/default.aspx
Bye! IM