Java Forum / General / September 2007

You Cannot Use Serialization to Get Object Size

moleskyca1@yahoo.com - 24 Aug 2007 21:50 GMT
One thread here said you can serialize an object and count the serialized
bytes to get the size of the object.
This is incorrect. I serialized a small class with one boolean and 2
chars and got something crazy like 637 bytes for the size. I serialized
to a file and you can see the problem:

¼φ ♣{sr  java.io.NotSerializableException(Vx τå▬5☻  xr
↔java.io.ObjectStreamExce
ptiond├Σkì9√▀☻  xr ‼java.io.IOExceptionlÇsde%≡½☻  xr ‼
java.lang.Exception╨²▼>

There are many bytes serialized, so you can't use this to count the size
of an object in bytes.
Manish Pandit - 24 Aug 2007 22:19 GMT
On Aug 24, 1:50 pm, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object

That is never the way to get an object's in-memory size. What if the
class declares all the fields as transient?

The output you pasted indicates that one or more of the attributes in
the class that you serialized are non-serializable (do not implement
java.io.Serializable).

-cheers,
Manish
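
To illustrate the point about transient fields, here is a minimal sketch. The class names `Plain` and `AllTransient` and the helper `serializedSize` are hypothetical, made up for this example. Both classes hold the same four long fields, so they should occupy about the same memory, yet the serialized form of the second one carries no field data at all.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientSizeDemo {
    // Hypothetical class: all four fields are written to the stream.
    static class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        long a = 1, b = 2, c = 3, d = 4;
    }

    // Same fields, but transient: default serialization skips them.
    static class AllTransient implements Serializable {
        private static final long serialVersionUID = 1L;
        transient long a = 1, b = 2, c = 3, d = 4;
    }

    // Serializes one object to an in-memory stream and returns the byte count.
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("Plain:        " + serializedSize(new Plain()) + " bytes");
        System.out.println("AllTransient: " + serializedSize(new AllTransient()) + " bytes");
    }
}
```

The two printed numbers differ even though the in-memory layout of the two classes is presumably the same, which is why the serialized byte count says little about heap usage.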
Roedy Green - 24 Aug 2007 23:03 GMT
>One thread here said you can serialize object and count serialized
>bytes to get the size of the object
>This is incorrect. I serialize small class with one boolean and 2
>chars and
>got something crazy like 637 bytes for the size. I serialize to file
>and you see the problem:

Dump the stream to a file and have a look at it with a hex editor. A
serialized stream has quite a bit of overhead for the first object,
namely the fully qualified names of all the types of all the fields
used and the field names. Try dumping several objects and compare the
streams. You will see the incremental size is quite reasonable.
Further, you normally GZIP these streams. They compact nicely.  See
http://mindprod.com/applet/fileio.html
for sample code.

Further objects pointed to and their descriptors go in the stream too.
Often much more crud than you imagine gets dragged along.  Check it
out with a hex editor to make sure you have not inadvertently dragged
along the kitchen sink.
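
The "dump several objects and compare" experiment can be sketched as follows. The class `Small` (one boolean and two chars, like the class described in the first post) and the helper `streamSize` are assumptions for illustration, not code from the linked page.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StreamOverheadDemo {
    // Hypothetical class with one boolean and two chars.
    static class Small implements Serializable {
        private static final long serialVersionUID = 1L;
        boolean flag = true;
        char first = 'a';
        char second = 'b';
    }

    // Writes `count` distinct Small objects to ONE stream and returns its size.
    static int streamSize(int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(new Small());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int one = streamSize(1);
        int two = streamSize(2);
        // The first object pays for the stream header and the class descriptor;
        // later objects of the same class only reference that descriptor.
        System.out.println("first object:      " + one + " bytes");
        System.out.println("each later object: " + (two - one) + " bytes");
    }
}
```

The second figure is the incremental cost per object on a shared stream; the first is what you pay when every object gets its own stream.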
Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Arne Vajhøj - 25 Aug 2007 00:40 GMT
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
[quoted text clipped - 10 lines]
> There is many bytes serialized so you can't use this to count size of
> object in bytes.

Correct.

Try something like:

public class SizeOf2 {
    private static final int N = 1000000;

    // Approximate heap in use. System.gc() is only a request to the JVM,
    // so treat the result as an estimate.
    public static long mem() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long m1 = mem();
        int[] ia = new int[N];
        long m2 = mem();
        System.out.println("sizeof int = " + (m2 - m1) * 1.0 / N);
        ia = null; // release the array so it is not counted in the next measurement
        long m3 = mem();
        double[] xa = new double[N];
        long m4 = mem();
        System.out.println("sizeof double = " + (m4 - m3) * 1.0 / N);
        xa = null;
    }
}

Arne
Manivannan Palanichamy - 25 Aug 2007 21:31 GMT
On Aug 25, 1:50 am, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
[quoted text clipped - 10 lines]
> There is many bytes serialized so you can't use this to count size of
> object in bytes.

First of all, what is meant by an object's size? This is not C/C++. In
C and C++, the object size is determined by the member variables (plus
any alignment padding). But in Java, it depends on the implementation.
For example, the Java Language Specification just says that a boolean
should take either 'true' or 'false'. It doesn't place any constraints on
the implementation, such as requiring the boolean size to be 1 or 10
bytes. Some implementations might represent a boolean variable in a
single byte. Other implementations may represent the boolean variable in
2 bytes or more. So size is implementation specific.

One more thing: object construction is also implementation
specific. Assume you declare 5 integers in a serializable class. You
might think that the instance size for the class will be (5 * 4)
20 bytes. But that can't always be the case, because the JVM might add
some internal fields to represent the 'serializable' feature, or it
might do some trick when constructing the particular instance. So there
is no guarantee that your measured 'size' will be accurate.

I would suggest not talking about *size* in Java. Talk about memory.

--
Manivannan Palanichamy (@) Oracle.com
http://mani.gw.googlepages.com/index.html
moleskyca1@yahoo.com - 26 Aug 2007 16:15 GMT
On Aug 25, 4:31 pm, Manivannan Palanichamy
<manivannan.palanich...@gmail.com> wrote:
> On Aug 25, 1:50 am, molesky...@yahoo.com wrote:
>
[quoted text clipped - 35 lines]
> --
> Manivannan Palanichamy (@) Oracle.com
> http://mani.gw.googlepages.com/index.html

So what is the way to compute the memory consumed by an object? It is
hard to work with a language that doesn't support this. Can anyone post
some code that works? Say you have this class; what is the total memory
for each instance?

public class Goo implements Serializable {
 public int one;
 public boolean two;
 public boolean three;
 public double x;

}

This is something that a programmer should be able to do in any
language. I cannot do it in Java, but I am new. Can anyone do it?
Lew - 26 Aug 2007 16:43 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this. Can anyone post
[quoted text clipped - 11 lines]
> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?

AFAIK there is no general answer to "how large is an object" in Java, unless
one explicitly accounts for the time element.

"Memory consumed" makes most sense in a runtime context.  Others have alluded
to the difficulty, for example, of correlating the size of a serialized
representation to any runtime impact.  Let's grant that what we care about is
amount of memory consumed by an instance at runtime.

But runtime is an interval - a program is loaded, runs for a while then ends.
 The envelope of that varies according to the complexity of the program, its
usage patterns, whether it's a server process and so on.  During that
interval, the shape of a program varies wildly due to Java's dynamic nature.

Even individual objects of a class could be implemented differently at
different times during runtime.  For that matter, the same instance can change
its memory footprint during its lifetime.  Are you interested in the
instantaneous memory footprint, the mean memory consumption, its maximum?
Over a single instance's lifetime or aggregated for the lifetime of the class?

For that matter, the class itself might be garbage collected altogether during
the program's run.  If it's something used only during program initialization,
it might have an instantaneous footprint that is egregious but has no negative
impact on the program's performance during normal operation after it's been
collected.  Even during the init phase, hotspotting might inline the whole
thing and it would essentially disappear even while in use.

These factors make it difficult to give any kind of simple answer to your
question.

Signature

Lew

Lasse Reichstein Nielsen - 26 Aug 2007 17:01 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this.

No it's not. I have yet to need it for anything in Java.

It all depends on how you think about memory. In C, you need to count
bytes and do pointer arithmetic. You need to know how many bytes you
use, because you are doing memory management manually.  In Java, you
don't care about the exact size of an object. You care about how many
objects you create and how long they live, but whether an object has 4
or 8 bytes of overhead is completely irrelevant.

> Can anyone post some code that work? Say you have this class what
> will is total memory for each instance:
[quoted text clipped - 5 lines]
>   public double x;
> }

As others have said, neither the Java Language Specification nor the
Java Virtual Machine Specification gives requirements on how large
object implementations must be. Different JVM implementations can,
and probably do, differ.

> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?

Can you say what you need it for? Curiosity is fine, but any algorithm
that cares about the physical size of an object, i.e., deals with
objects on the byte level, is likely to be less portable than one that
deals with objects at the object level.

/L
Signature

Lasse Reichstein Nielsen  -  lrn@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
 'Faith without judgement merely degrades the spirit divine.'

Patricia Shanahan - 26 Aug 2007 17:29 GMT
>> So what is the way to compute the memory consumed by object? This is
>> hard to work with language that doesn't support this.
[quoted text clipped - 7 lines]
> objects you create and how long they live, but whether an object has 4
> or 8 bytes of overhead is completely irrelevant.

Huh? Here's a specific problem. Suppose I have an application that uses
a lot of memory. The size of a problem can be expressed in terms of a
few parameters. I know the numbers of certain types of objects that will
be created, as functions of those problem size parameters.

For simplicity, let's assume a single basic size parameter N. However,
for real problems there may be more size parameters.

I would like to run a problem with N=10,000. I know, by experiment, that
it does not run on any machine to which I currently have access. If I
ask my academic adviser (or my manager if I were working in industry)
for access to a bigger memory, the inevitable question is "How big?".

How should I go about answering that question, without caring about
object sizes?

Patricia
Lasse Reichstein Nielsen - 26 Aug 2007 18:21 GMT
>> In Java, you don't care about the exact size of an object. You care
>> about how many objects you create and how long they live, but
>> whether an object has 4 or 8 bytes of overhead is completely
>> irrelevant.

> Huh? Here's a specific problem. Suppose I have an application that uses
> a lot of memory. The size of a problem can be expressed in terms of a
[quoted text clipped - 8 lines]
> ask my academic adviser (or my manager if I were working in industry)
> for access to a bigger memory, the inevitable question is "How big?".

I admit I was overgeneralizing. However, I don't think it was by much :)

So the "How big?" question is a good one, but hard to answer.
If you know that on the current version of the VM, OS and hardware you
plan to use, the objects take exactly 24 bytes, how much memory will
you need then? What if it was 32 bytes?

If 99% of the heap will be occupied by the data objects, which
will all stay live during the entire computation, then the poor garbage
collector won't have much to do. If not, you need to know also how many
objects are alive at a time (this might not be linear in the problem
size). You should consider how a generational garbage collector interacts
with all these objects, and tweak VM parameters to match.

If you are cutting it so close that the overhead of the object
implementation counts, then I would go for a language with manual
memory management instead of Java. I wouldn't rely on automatic,
garbage collected, memory management for something with so much
simultaneous live data.
I'm certain not everybody agrees with such heresy :)

> How should I go about answering that question, without caring about
> object sizes?

Object sizes alone won't be enough. It's the path of bit-fiddling and
platform-specific tweaking to get there, and then Java has already
lost much of its advantage.

/L
Signature

Lasse Reichstein Nielsen  -  lrn@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
 'Faith without judgement merely degrades the spirit divine.'

Patricia Shanahan - 26 Aug 2007 18:27 GMT
>>> In Java, you don't care about the exact size of an object. You care
>>> about how many objects you create and how long they live, but
[quoted text clipped - 20 lines]
> plan to use, the objects takes exactly 24 bytes, how much memory will
> you need then? What if it was 32 bytes?  

The technique I would use to answer the question is to measure, on the
JVM I intend to use but on a system to which I have access, the in-use
memory (totalMemory() - freeMemory() immediately after a System.gc()
call) with controlled numbers of objects in existence. Given that
result, and the relationships between problem size and object creation,
I can project out the memory for larger problem sizes.
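
A minimal sketch of this measure-and-project technique, using a class shaped like the Goo example from earlier in the thread. The names `usedWith`, `bytesPerObject` and `project`, and the object counts chosen in `main`, are assumptions for illustration. System.gc() is only a hint to the JVM, so the measured numbers are estimates and will vary between JVMs and runs.

```java
public class MemoryProjection {
    // Hypothetical payload, modeled on the Goo class posted in this thread.
    static class Goo {
        int one;
        boolean two;
        boolean three;
        double x;
    }

    // In-use heap right after requesting a collection (an estimate only).
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Heap growth caused by keeping `count` Goo objects alive (count >= 1).
    // The figure includes the array slots that hold the references.
    static long usedWith(int count) {
        long before = usedMemory();
        Goo[] live = new Goo[count];
        for (int i = 0; i < count; i++) {
            live[i] = new Goo();
        }
        long after = usedMemory();
        // Use the array after measuring so it stays reachable until here.
        if (live[count - 1] == null) {
            throw new IllegalStateException("unexpected null element");
        }
        return after - before;
    }

    // Slope between two measurements: bytes per additional object.
    static double bytesPerObject(long bytesAtN1, int n1, long bytesAtN2, int n2) {
        return (double) (bytesAtN2 - bytesAtN1) / (n2 - n1);
    }

    // Linear projection from a measured point to a larger object count.
    static long project(long bytesAtN1, int n1, double perObject, long n) {
        return bytesAtN1 + Math.round(perObject * (n - n1));
    }

    public static void main(String[] args) {
        int n1 = 500_000;
        int n2 = 1_000_000;
        long b1 = usedWith(n1);
        long b2 = usedWith(n2);
        double per = bytesPerObject(b1, n1, b2, n2);
        System.out.println("approx. bytes per Goo (incl. reference): " + per);
        System.out.println("projected bytes for 50,000,000 objects: "
                + project(b1, n1, per, 50_000_000L));
    }
}
```

Taking the slope between two measurements cancels out fixed overhead that is unrelated to the objects being counted. A linear projection only holds if the number of live objects really grows linearly with the problem size.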

> If you are cutting it so close that the overhead of the object
> implementation counts, then I would go for a language with manual
> memory management instead of Java. I wouldn't rely on automatic,
> garbage collected, memory management for something with so much
> simultaneous live data.
> I'm certain not everybody agrees with such heresy :)

There is too much that is convenient about Java for me to throw it out
just because memory size estimation requires some effort. Note that I
have seen far more consistency than you seem to expect in things like
array and object overhead. Of course, switching to a 64 bit JVM does
make a difference.

Patricia
Arne Vajhøj - 26 Aug 2007 19:31 GMT
> The technique I would use to answer the question is to measure, on the
> JVM I intend to use but on a system to which I have access, the in-use
> memory (totalMemory() - freeMemory() immediately after a System.gc()
> call) with controlled numbers of objects in existence. Given that
> result, and the relationships between problem size and object creation,
> I can project out the memory for larger problem sizes.

And I actually posted a code snippet doing so yesterday.

Arne
Arne Vajhøj - 26 Aug 2007 19:30 GMT
> So what is the way to compute the memory consumed by object? This is
> hard to work with language that doesn't support this. Can anyone post
[quoted text clipped - 11 lines]
> This is something that programmer should be able to do on any
> language. I cannot do it in java, but i am new. Can anyone do it?

I posted a solution yesterday.

Arne
Twisted - 31 Aug 2007 18:21 GMT
On Aug 24, 4:50 pm, molesky...@yahoo.com wrote:
> One thread here said you can serialize object and count serialized
> bytes to get the size of the object
> This is incorrect. I serialize small class with one boolean and 2
> chars and
> got something crazy like 637 bytes for the size.

This has all kinds of problems.

First, objects may be larger in memory than serialized, due to
transient fields. As others noted, measuring size in memory is best
done by something like

Runtime rt = Runtime.getRuntime();
System.gc();
long usage = rt.totalMemory() - rt.freeMemory();
MyObject myObject = new MyObject(args);
System.gc();
long size = rt.totalMemory() - rt.freeMemory() - usage;

This still will vary from VM to VM, and may not work perfectly
(System.gc() doesn't guarantee the gc runs, so it can err high if
transient objects are made and discarded by the MyObject constructor
and the second System.gc() does nothing, and it can err low if the
first System.gc() does nothing and the second does and collects some
garbage).

Measuring the size of serialized objects can be done more reliably,
since for an identical object it will be identical on all platforms
given the same version of the object's class. It may vary from
instance to instance depending on what objects it references or
contains via its member variables though. Still it will give you an
idea of how much disk space or bandwidth serialized instances will
consume in bulk.

But serialized output contains overhead; this will be most of your 637
bytes. I'd serialize an N-element array of MyObjects and an N+1-
element array of MyObjects, both with every array cell containing a
MyObject (rather than null), and look at the difference in their file
sizes. I'd make the fields that would tend to reference shared objects
reference a single instance from all these MyObjects so that their
referents "don't count" in the final calculation, and the fields that
would tend to reference "owned" objects or "contained" ones reference
separate ones for each instance so that their referents do count. This
will give the best idea of scaling behavior when a large number of
MyObjects are serialized on a single stream. Your original figure of
637 bytes is, on the other hand, unfortunately exactly what you can
expect if each one is serialized on a separate stream.
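
A sketch of the N versus N+1 array comparison described above. The `MyObject` fields here are hypothetical: `label` stands in for a shared referent (every instance points at the same String) and `data` for an owned one (each instance has its own int[4]).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ArrayIncrementDemo {
    static final String SHARED = "shared-label";

    // Hypothetical class with one shared and one owned referent.
    static class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = SHARED;    // shared: written once, then back-referenced
        int[] data = new int[4];  // owned: written in full for every instance
    }

    // Serializes a fully populated n-element array and returns the byte count.
    static int serializedArraySize(int n) {
        MyObject[] array = new MyObject[n];
        for (int i = 0; i < n; i++) {
            array[i] = new MyObject();
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(array);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int n = 1000;
        int perObject = serializedArraySize(n + 1) - serializedArraySize(n);
        System.out.println("incremental serialized size per MyObject: "
                + perObject + " bytes");
    }
}
```

The difference between the two stream sizes is the per-instance cost in bulk, with the owned int[] counted and the shared String reduced to a back-reference.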

Note that the suggested method of measurement should end up summing
the MyObject size as the size of its fields, with fields of reference
type being the size of a pointer or some equivalent, plus the size of
the objects an instance references with such fields and actually
"owns".
Roedy Green - 01 Sep 2007 13:02 GMT
>One thread here said you can serialize object and count serialized
>bytes to get the size of the object

see http://mindprod.com/jgloss/sizeof.html

Signature

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.