Java Forum / General / September 2007
Serialize You Cannot Use for Object Size
moleskyca1@yahoo.com - 24 Aug 2007 21:50 GMT
One thread here said you can serialize an object and count the serialized bytes to get the size of the object. This is incorrect. I serialized a small class with one boolean and 2 chars and got something crazy like 637 bytes for the size. I serialized to a file and you can see the problem:
¼φ ♣{sr java.io.NotSerializableException(Vx τå▬5☻ xr ↔java.io.ObjectStreamExce ptiond├Σkì9√▀☻ xr ‼java.io.IOExceptionlÇsde%≡½☻ xr ‼ java.lang.Exception╨²▼>
There are many bytes serialized, so you can't use this to count the size of an object in bytes.
Manish Pandit - 24 Aug 2007 22:19 GMT
On Aug 24, 1:50 pm, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
That is never the way to get an object's in-memory size. What if the class declares all the fields as transient?
The output you pasted indicates that one or more of the attributes in the class that you serialized are non-serializable (do not implement java.io.Serializable).
-cheers, Manish
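A minimal sketch of both points, assuming three hypothetical classes (`Plain`, `AllTransient`, `Holder`) that do not appear in the thread. It suggests that the serialized byte count ignores transient fields entirely, and that a field whose runtime type does not implement java.io.Serializable makes writeObject throw NotSerializableException. ObjectOutputStream may also write the exception itself into the stream on failure, which would be consistent with the exception class names visible in the dump above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSizeDemo {

    // Hypothetical class: all four fields are written to the stream.
    static class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        long a = 1, b = 2, c = 3, d = 4;
    }

    // Same fields, all transient: none of them is written.
    static class AllTransient implements Serializable {
        private static final long serialVersionUID = 1L;
        transient long a = 1, b = 2, c = 3, d = 4;
    }

    // Serializable class holding a field whose runtime type is not Serializable.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        Object ref = new Object();
    }

    // Serializes one object to memory and returns the number of bytes written.
    static int serializedSize(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    // Returns true if serialization fails because something is not Serializable.
    static boolean failsToSerialize(Object o) {
        try {
            serializedSize(o);
            return false;
        } catch (UncheckedIOException e) {
            return e.getCause() instanceof NotSerializableException;
        }
    }

    public static void main(String[] args) {
        System.out.println("Plain:        " + serializedSize(new Plain()) + " bytes");
        System.out.println("AllTransient: " + serializedSize(new AllTransient()) + " bytes");
        System.out.println("Holder fails: " + failsToSerialize(new Holder()));
    }
}
```

Both classes occupy memory for four longs at runtime, yet only `Plain` pays for them in the stream, so the serialized count says little about the in-memory footprint.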
Roedy Green - 24 Aug 2007 23:03 GMT
>One thread here said you can serialize object and count serialized
>bytes to get the size of the object
>This is incorrect. I serialize small class with one boolean and 2
>chars and
>got something crazy like 637 bytes for the size. I serialize to file
>and you see the problem:
Dump the stream to a file and have a look at it with a hex editor. A serialized stream has quite a bit of overhead for the first object, namely the fully qualified names of all the types of all the fields used and the field names. Try dumping several objects and compare the streams. You will see the incremental size is quite reasonable. Further, you normally GZIP these streams. They compact nicely. See http://mindprod.com/applet/fileio.html for sample code.
Further, objects pointed to and their descriptors go in the stream too. Often much more crud than you imagine gets dragged along. Check it out with a hex editor to make sure you have not inadvertently dragged along the kitchen sink.
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com
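The "dump several objects and compare" experiment can be sketched roughly as follows. `Small` is a hypothetical stand-in for the class in the original post (one boolean, two chars), and the streams are kept in memory rather than written to a file; treat it as an illustration under those assumptions, not as the sample code from the linked page.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

public class IncrementalSizeDemo {

    // Hypothetical stand-in for the class in the original post.
    static class Small implements Serializable {
        private static final long serialVersionUID = 1L;
        boolean flag = true;
        char c1 = 'a';
        char c2 = 'b';
    }

    // Writes n distinct Small instances to a single stream, optionally
    // GZIP-compressed, and returns the total number of bytes produced.
    static int streamSize(int n, boolean gzip) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            OutputStream sink = gzip ? new GZIPOutputStream(buf) : buf;
            try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
                for (int i = 0; i < n; i++) {
                    out.writeObject(new Small());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    public static void main(String[] args) {
        int one = streamSize(1, false);
        int two = streamSize(2, false);
        System.out.println("first object:      " + one + " bytes");
        System.out.println("each further one:  " + (two - one) + " bytes");
        System.out.println("1000 objects:      " + streamSize(1000, false) + " bytes");
        System.out.println("1000 objects gzip: " + streamSize(1000, true) + " bytes");
    }
}
```

The class descriptor is written once per stream, so the difference between the one-object and two-object streams is the figure that reflects the per-object cost.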
Arne Vajhøj - 25 Aug 2007 00:40 GMT
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
[quoted text clipped - 10 lines]
> There is many bytes serialized so you can't use this to count size of
> object in bytes.
Correct.
Try something like:
public class SizeOf2 {
    private final static int N = 1000000;

    // In-use heap after requesting a garbage collection.
    public static long mem() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        // Heap growth caused by an N-element int array, divided by N.
        long m1 = mem();
        int[] ia = new int[N];
        long m2 = mem();
        System.out.println("sizeof int = " + (m2 - m1)*1.0/N);
        ia = null;
        // Same measurement for an N-element double array.
        long m3 = mem();
        double[] xa = new double[N];
        long m4 = mem();
        System.out.println("sizeof double = " + (m4 - m3)*1.0/N);
        xa = null;
    }
}
Arne
Manivannan Palanichamy - 25 Aug 2007 21:31 GMT
On Aug 25, 1:50 am, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
[quoted text clipped - 10 lines]
> There is many bytes serialized so you can't use this to count size of
> object in bytes.
First of all, what is meant by an object's size? This is not C/C++. In C and C++, the object size is calculated by summing up the member variables. In Java, it depends on the implementation. For example, the Java Language Specification just says that a boolean takes either 'true' or 'false'. It doesn't impose any constraint on the implementation, such as requiring the boolean's size to be 1 byte or 10 bytes. Some implementations might represent a boolean variable in a single byte. Others may represent it in 2 bytes or more. So size is implementation specific.
One more thing: 'object construction' is also implementation specific. Assume you declare 5 integers in a serializable class. You might think that the instance size for the class will be (5 * 4) 20 bytes. But that can't always be the case, because the JVM might add some internal fields to represent the 'serialized' feature, or it might do some trick while constructing the particular instance. So there is no guarantee that your measured 'size' will be accurate.
I would suggest not to talk about *size* in java. Talk about memory.
-- Manivannan Palanichamy (@) Oracle.com http://mani.gw.googlepages.com/index.html
moleskyca1@yahoo.com - 26 Aug 2007 16:15 GMT
On Aug 25, 4:31 pm, Manivannan Palanichamy <manivannan.palanich...@gmail.com> wrote:
> On Aug 25, 1:50 am, molesky...@yahoo.com wrote:
> [quoted text clipped - 35 lines]
> --
> Manivannan Palanichamy (@) Oracle.com http://mani.gw.googlepages.com/index.html
So what is the way to compute the memory consumed by an object? It is hard to work with a language that doesn't support this. Can anyone post some code that works? Say you have this class, what is the total memory for each instance:
public class Goo implements Serializable {
    public int one;
    public boolean two;
    public boolean three;
    public double x;
}
This is something that a programmer should be able to do in any language. I cannot do it in Java, but I am new. Can anyone do it?
Lew - 26 Aug 2007 16:43 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this. Can anyone post
[quoted text clipped - 11 lines]
> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?
AFAIK there is no general answer to "how large is an object" in Java, unless one explicitly accounts for the time element.
"Memory consumed" makes most sense in a runtime context. Others have alluded to the difficulty, for example, of correlating the size of a serialized representation to any runtime impact. Let's grant that what we care about is amount of memory consumed by an instance at runtime.
But runtime is an interval - a program is loaded, runs for a while then ends. The envelope of that varies according to the complexity of the program, its usage patterns, whether it's a server process and so on. During that interval, the shape of a program varies wildly due to Java's dynamic nature.
Even individual objects of a class could be implemented differently at different times during runtime. For that matter, the same instance can change its memory footprint during its lifetime. Are you interested in the instantaneous memory footprint, the mean memory consumption, its maximum? Over a single instance's lifetime or aggregated for the lifetime of the class?
For that matter, the class itself might be garbage collected altogether during the program's run. If it's something used only during program initialization, it might have an instantaneous footprint that is egregious but has no negative impact on the program's performance during normal operation after it's been collected. Even during the init phase, hotspotting might inline the whole thing and it would essentially disappear even while in use.
These factors make it difficult to give any kind of simple answer to your question.
 Signature Lew
Lasse Reichstein Nielsen - 26 Aug 2007 17:01 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this.
No it's not. I have yet to need it for anything in Java.
It all depends on how you think about memory. In C, you need to count bytes and do pointer arithmetic. You need to know how many bytes you use, because you are doing memory management manually. In Java, you don't care about the exact size of an object. You care about how many objects you create and how long they live, but whether an object has 4 or 8 bytes of overhead is completely irrelevant.
> Can anyone post some code that work? Say you have this class what
> will is total memory for each instance:
[quoted text clipped - 5 lines]
> public double x;
> }
As others have said, neither the Java Language Specification nor the Java Virtual Machine Specification gives requirements on how large object implementations must be. Different JVM implementations can, and probably do, differ.
> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?
Can you say what you need it for? Curiosity is fine, but any algorithm that cares about the physical size of an object, i.e., deals with objects on the byte level, is likely to be less portable than one that deals with objects at the object level.
/L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Patricia Shanahan - 26 Aug 2007 17:29 GMT
>> So what is the way to compute the memory consumed by object? This is
>> hard to work with language that doesn't support this.
[quoted text clipped - 7 lines]
> objects you create and how long they live, but whether an object has 4
> or 8 bytes of overhead is completely irrelevant.
Huh? Here's a specific problem. Suppose I have an application that uses a lot of memory. The size of a problem can be expressed in terms of a few parameters. I know the numbers of certain types of objects that will be created, as functions of those problem size parameters.
For simplicity, let's assume a single basic size parameter N. However, for real problems there may be more size parameters.
I would like to run a problem with N=10,000. I know, by experiment, that it does not run on any machine to which I currently have access. If I ask my academic adviser (or my manager if I were working in industry) for access to a bigger memory, the inevitable question is "How big?".
How should I go about answering that question, without caring about object sizes?
Patricia
Lasse Reichstein Nielsen - 26 Aug 2007 18:21 GMT
>> In Java, you don't care about the exact size of an object. You care
>> about how many objects you create and how long they live, but
>> whether an object has 4 or 8 bytes of overhead is completely
>> irrelevant.
> Huh? Here's a specific problem. Suppose I have an application that uses
> a lot of memory. The size of a problem can be expressed in terms of a
[quoted text clipped - 8 lines]
> ask my academic adviser (or my manager if I were working in industry)
> for access to a bigger memory, the inevitable question is "How big?".
I admit I was overgeneralizing. However, I don't think it was by much :)
So the "How big?" question is a good one, but hard to answer. If you know that on the current version of the VM, OS and hardware you plan to use, the objects take exactly 24 bytes, how much memory will you need then? What if it was 32 bytes?
If 99% of the heap will be occupied by the data objects, which will all stay live during the entire computation, then the poor garbage collector won't have much to do. If not, you need to know also how many objects are alive at a time (this might not be linear in the problem size). You should consider how a generational garbage collector interacts with all these objects, and tweak VM parameters to match.
If you are cutting it so close that the overhead of the object implementation counts, then I would go for a language with manual memory management instead of Java. I wouldn't rely on automatic, garbage collected, memory management for something with so much simultaneous live data. I'm certain not everybody agrees with such heresy :)
> How should I go about answering that question, without caring about
> object sizes?
Object sizes alone won't be enough. It's the path of bit-fiddling and platform-specific tweaking to get there, and then Java has already lost much of its advantage.
/L
 Signature Lasse Reichstein Nielsen - lrn@hotpop.com DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html> 'Faith without judgement merely degrades the spirit divine.'
Patricia Shanahan - 26 Aug 2007 18:27 GMT
>>> In Java, you don't care about the exact size of an object. You care
>>> about how many objects you create and how long they live, but
[quoted text clipped - 20 lines]
> plan to use, the objects takes exactly 24 bytes, how much memory will
> you need then? What if it was 32 bytes?
The technique I would use to answer the question is to measure, on the JVM I intend to use but on a system to which I have access, the in-use memory (totalMemory() - freeMemory() immediately after a System.gc() call) with controlled numbers of objects in existence. Given that result, and the relationships between problem size and object creation, I can project out the memory for larger problem sizes.
> If you are cutting it so close that the overhead of the object
> implementation counts, then I would go for a language with manual
> memory management instead of Java. I wouldn't rely on automatic,
> garbage collected, memory management for something with so much
> simultaneous live data.
> I'm certain not everybody agrees with such heresy :)
There is too much that is convenient about Java for me to throw it out just because memory size estimation requires some effort. Note that I have seen far more consistency than you seem to expect in things like array and object overhead. Of course, switching to a 64 bit JVM does make a difference.
Patricia
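The measure-and-project technique described above might look roughly like this when applied to the Goo class from the question. It is a sketch under assumptions: the helper names (`usedMemory`, `bytesPerInstance`, `projectedBytes`) and the sample count of 200,000 are illustrative choices, System.gc() is only a request, and the result depends on the JVM and its settings.

```java
public class HeapPerInstance {

    // The class from the question, nested here so the sketch is self-contained.
    static class Goo implements java.io.Serializable {
        public int one;
        public boolean two;
        public boolean three;
        public double x;
    }

    // In-use heap after requesting a collection. System.gc() is only a hint,
    // so the result is an estimate, not an exact figure.
    static long usedMemory() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Estimates bytes per Goo by allocating n instances (n >= 1) and dividing
    // the growth in used heap by n. The array holding the references is
    // allocated before the first measurement so its own size is not counted.
    static double bytesPerInstance(int n) {
        Goo[] keep = new Goo[n];
        long before = usedMemory();
        for (int i = 0; i < n; i++) {
            keep[i] = new Goo();
        }
        long after = usedMemory();
        // Touch the array after measuring so the objects are still reachable.
        if (keep[n - 1] == null) {
            throw new IllegalStateException("unexpected null element");
        }
        return (after - before) / (double) n;
    }

    // Projects the heap needed for a larger problem size from the estimate.
    static long projectedBytes(double perInstance, long problemSize) {
        return Math.round(perInstance * problemSize);
    }

    public static void main(String[] args) {
        double per = bytesPerInstance(200_000);
        System.out.println("approx. bytes per Goo: " + per);
        System.out.println("projected for 10,000,000 instances: "
                + projectedBytes(per, 10_000_000L) + " bytes");
    }
}
```

Running it on the target JVM with the intended flags matters, since a 64 bit JVM or different settings can change the per-instance figure.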
Arne Vajhøj - 26 Aug 2007 19:31 GMT
> The technique I would use to answer the question is to measure, on the
> JVM I intend to use but on a system to which I have access, the in-use
> memory (totalMemory() - freeMemory() immediately after a System.gc()
> call) with controlled numbers of objects in existence. Given that
> result, and the relationships between problem size and object creation,
> I can project out the memory for larger problem sizes.
And I actually posted a code snippet doing so yesterday.
Arne
Arne Vajhøj - 26 Aug 2007 19:30 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this. Can anyone post
[quoted text clipped - 11 lines]
> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?
I posted a solution yesterday.
Arne
Twisted - 31 Aug 2007 18:21 GMT
On Aug 24, 4:50 pm, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
> This is incorrect. I serialize small class with one boolean and 2
> chars and
> got something crazy like 637 bytes for the size.
This has all kinds of problems.
First, objects may be larger in memory than serialized, due to transient fields. As others noted, measuring size in memory is best done by something like
Runtime rt = Runtime.getRuntime();
System.gc();
// totalMemory() and freeMemory() return long, so the results are kept as long.
long usage = rt.totalMemory() - rt.freeMemory();
MyObject myObject = new MyObject(args);
System.gc();
long size = rt.totalMemory() - rt.freeMemory() - usage;
This still will vary from VM to VM, and may not work perfectly (System.gc() doesn't guarantee the gc runs, so it can err high if transient objects are made and discarded by the MyObject constructor and the second System.gc() does nothing, and it can err low if the first System.gc() does nothing and the second does and collects some garbage).
Measuring the size of serialized objects can be done more reliably, since for an identical object it will be identical on all platforms given the same version of the object's class. It may vary from instance to instance depending on what objects it references or contains via its member variables though. Still it will give you an idea of how much disk space or bandwidth serialized instances will consume in bulk.
But serialized output contains overhead; this will be most of your 637 bytes. I'd serialize an N-element array of MyObjects and an (N+1)-element array of MyObjects, both with every array cell containing a MyObject (rather than null), and look at the difference in their file sizes. I'd make the fields that would tend to reference shared objects reference a single instance from all these MyObjects so that their referents "don't count" in the final calculation, and the fields that would tend to reference "owned" objects or "contained" ones reference separate ones for each instance so that their referents do count. This will give the best idea of scaling behavior when a large number of MyObjects are serialized on a single stream. Your original figure of 637 bytes is, on the other hand, unfortunately exactly what you can expect if each one is serialized on a separate stream.
Note that the suggested method of measurement should end up summing the MyObject size as the size of its fields, with fields of reference type being the size of a pointer or some equivalent, plus the size of the objects an instance references with such fields and actually "owns".
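The N versus N+1 comparison could be sketched as below, using the Goo class from earlier in the thread in place of MyObject and an in-memory buffer in place of files. The helper names are hypothetical, and Goo has only primitive fields, so the shared-versus-owned referent distinction does not come into play in this simplified version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ArrayDeltaDemo {

    // The class from earlier in the thread, nested so the sketch is self-contained.
    static class Goo implements Serializable {
        private static final long serialVersionUID = 1L;
        public int one;
        public boolean two;
        public boolean three;
        public double x;
    }

    // Serializes an n-element array in which every cell holds its own Goo
    // and returns the number of bytes written.
    static int serializedArraySize(int n) {
        Goo[] cells = new Goo[n];
        for (int i = 0; i < n; i++) {
            cells[i] = new Goo();
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(cells);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    // Marginal serialized size of one more element in an array of n elements.
    static int perElementBytes(int n) {
        return serializedArraySize(n + 1) - serializedArraySize(n);
    }

    public static void main(String[] args) {
        System.out.println("array of 1:          " + serializedArraySize(1) + " bytes");
        System.out.println("marginal per element: " + perElementBytes(100) + " bytes");
    }
}
```

The marginal figure is the one that predicts disk space or bandwidth in bulk; the size of a one-element array is dominated by the stream header and class descriptors.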
Roedy Green - 01 Sep 2007 13:02 GMT
>One thread here said you can serialize object and count serialized
>bytes to get the size of the object
See http://mindprod.com/jgloss/sizeof.html
 Signature Roedy Green Canadian Mind Products The Java Glossary http://mindprod.com