Java Forum / General / December 2006
OutOfMemoryException
srinivas.veeranki@gmail.com - 20 Dec 2006 11:49 GMT Hi All,
While I am retrieving from the data base I am getting more than 45k Records. I am unable to stored all those records in to the List. Here I am getting OutOfMemoryException. How Can i store all those records in the List.
Regards,
Srinivas.
bugbear - 20 Dec 2006 12:27 GMT > Hi All, > > While I am retrieving from the data base I am getting more than 45k > Records. I am unable to stored all those records in to the List. Here I > am getting OutOfMemoryException. How Can i store all those records in > the List. Use less memory per record. Which may involve simple data packing or compression.
BugBear
Oliver Wong - 20 Dec 2006 15:31 GMT >> Hi All, >> [quoted text clipped - 5 lines] > Use less memory per record. Which may involve > simple data packing or compression. Alternatively, assign more memory to your Java program, or don't actually store all the data in the list. See the Proxy design pattern, for an example of the latter.
- Oliver
bugbear - 20 Dec 2006 17:32 GMT >>Use less memory per record. Which may involve >>simple data packing or compression. > > Alternatively, assign more memory to your Java program, or don't > actually store all the data in the list. See the Proxy design pattern, for > an example of the latter. At the risk of topic drift, does any one know of a List implementation which doesn't suffer from ArrayList's "growth problem", causing (IMHO) premature out of memory reports?
The problem is that when the current memory allocation is exceeded, an ArrayList tries to double in size;
assuming you're currently using (e.g.) 64 Mb, it will try and allocate 128 Mb, then copy the data from the 64 to the 128, and only then does the 64 become available for GC.
Assuming you actual data size were 65 Mb, this means you would require 196 Mb to "survive" the handover.
A better data structure (w.r.t growth) would be a list of BLOCKs of some size. Additional blocks can be added, and serial (iterator) access would be fast.
Random access is also "quite" fast, if the blocks are a "reasonable" size.
Does anyone know an implementation of this, or something like it?
BugBear
Daniel Pitts - 20 Dec 2006 19:53 GMT > >>Use less memory per record. Which may involve > >>simple data packing or compression. [quoted text clipped - 30 lines] > > BugBear If you know before hand that you'll have a lot of data, you can tell the ArrayList to allocate a capacity that is as large (or larger) than you need.
EJP - 21 Dec 2006 00:39 GMT > At the risk of topic drift, does any one > know of a List implementation > which doesn't suffer from ArrayList's > "growth problem", causing (IMHO) premature > out of memory reports? java.util.LinkedList.
> The problem is that when the current memory allocation > is exceeded, an ArrayList tries to double in size; No. 'The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.' The JDK 1.5 implementation increases the size by 50%.
Bent C Dalager - 21 Dec 2006 09:52 GMT >> At the risk of topic drift, does any one >> know of a List implementation [quoted text clipped - 3 lines] > >java.util.LinkedList. Of course, if, as in the proposed scenario, you have 65MB of _references_ alone (which seems a lot), then a singly linked list would increase the resident requirement to 130MB and a doubly linked one to 196MB. This may or may not be considered an improvement over ArrayList's performance.
I don't know the details of how java.util.LinkedList is implemented though - it might use more than this, or it might use less (well, 130MB seems like a minimum any way you do it).
It really needs to be said that with 65MB of references, you are probably more worried about the space needed by the actual objects being referred to - unless there are multiple repetitions of the same reference within those 65MB.
>No. 'The details of the growth policy are not specified beyond the fact >that adding an element has constant amortized time cost.' The JDK 1.5 >implementation increases the size by 50%. Wouldn't this be a useful thing to have a protected method decide so that the policy can be modified by subclasses?
Cheers Bent D
 Signature Bent Dalager - bcd@pvv.org - http://www.pvv.org/~bcd powered by emacs
bugbear - 21 Dec 2006 11:11 GMT >> At the risk of topic drift, does any one >> know of a List implementation [quoted text clipped - 10 lines] > that adding an element has constant amortized time cost.' The JDK 1.5 > implementation increases the size by 50%. Actually, even if it only increment by 5% the instantaneous memory use is still double (old copy, slightly larger new copy) the actual requirment.
BugBear
Thomas Hawtin - 21 Dec 2006 15:13 GMT >> At the risk of topic drift, does any one >> know of a List implementation [quoted text clipped - 3 lines] > > java.util.LinkedList. Technically true, but the LinkedList will already be using vastly more memory than ArrayList by the time there is a problem. LinkedList is a good answer to very few questions.
Tom Hawtin
John Ersatznom - 21 Dec 2006 15:34 GMT > Technically true, but the LinkedList will already be using vastly more > memory than ArrayList by the time there is a problem. LinkedList is a > good answer to very few questions. One of those "very few" questions happens to arise frequently, however, namely "What do I do when I add only at one or both ends and remove only at the ends or when iterating"? LinkedList can do all of those in O(1), assuming a non-stupid implementation of the iterator. It's the perfect back end for various kinds of queues and temporary lists. That other thread where someone wants to read in variable-length arrays as arrays, for example -- various people suggested ArrayList, but the actual maximum efficiency is achieved by building LinkedLists and calling toArray at the end of building each one. (ArrayLists would wind up with some extra capacity, unless you did two passes over the file instead of one, which would be an inefficiency of its own.)
Mike Schilling - 21 Dec 2006 01:36 GMT >>>Use less memory per record. Which may involve >>>simple data packing or compression. [quoted text clipped - 19 lines] > Assuming you actual data size were 65 Mb, this > means you would require 196 Mb to "survive" the handover. No, the only thing that increases (doubling or otherwise) is the array of references it uses; the data those references point to isn't duplicated. If you have 64K objects and the reference array doubles in sixe, you'd add 128K references, which, depending on the size of a reference, would probably total between 512KB and 1MB.
Mark Rafn - 21 Dec 2006 19:15 GMT >At the risk of topic drift, does any one >know of a List implementation >which doesn't suffer from ArrayList's >"growth problem", causing (IMHO) premature >out of memory reports? When possible, please change the title when you cause or notice a topic drift. It makes things far easier to find later.
>The problem is that when the current memory allocation >is exceeded, an ArrayList tries to double in size; Triple, actually. It creates a new array twice the size of the old array, and copies the contents from the old array before the old array can be GC'd.
>A better data structure (w.r.t growth) would be a list >of BLOCKs of some size. Additional blocks can be added, >and serial (iterator) access would be fast. Yup. You then lose the constant-time indexed access, though. get(int) now becomes linear based on the number of blocks.
You could also use a tree (or a tree of arrays) structure to balance space, sequential access, and indexed access patterns.
>Does anyone know an implementation of this, or something like it? It's easy to implement your own Collection or List. For a fairly simple structure, I'd rather write my own than to try to find a match, understand it well enough to know it's right for me, and deal with an external dependency. This is one of those fine lines where reusing code isn't worth the complexity of finding and incorporating it.
That said, jakarta commons has a collections package that provides some useful implementations, and more importantly some useful abstract base classes that may be easier to extend than writing from scratch.
And THAT said, the previous responses which advised to change the system not to NEED such an operation are better than making a list which can do it. -- Mark Rafn dagon@dagon.net <http://www.dagon.net/>
Mike Schilling - 21 Dec 2006 19:38 GMT > That said, jakarta commons has a collections package that provides > some useful implementations, and more importantly some useful > abstract base classes that may be easier to extend than writing from > scratch. As does the JDK; AbstractList and AbstractSequentialList are damned useful things.
Oliver Wong - 21 Dec 2006 19:57 GMT >>A better data structure (w.r.t growth) would be a list >>of BLOCKs of some size. Additional blocks can be added, [quoted text clipped - 3 lines] > now > becomes linear based on the number of blocks. Unless you can randomly access the blocks too, in which case the index access is O(2) (i.e. one operation to find the relevant block, and another operation to find the relevant item within that block) which is O(1). When expanding the list, you'd have to allocate space for more blocks, and copy the references to the block, which is a bit faster than copying the references to each element within the block. It's basically the same asymptotic runtime performance, but you're just playing around with the constants.
If you allow blocks of blocks, then you basically have a tree, and so index access becomes O(log(n)).
- Oliver
bugbear - 22 Dec 2006 10:20 GMT >>>A better data structure (w.r.t growth) would be a list >>>of BLOCKs of some size. Additional blocks can be added, [quoted text clipped - 15 lines] > If you allow blocks of blocks, then you basically have a tree, and so > index access becomes O(log(n)). Hmm. What about (designing whilst posting) an array of between 10 and 20 blocks references?
If the blocks start at (say...) 50 references each, the structure initially grows by adding new blocks, until we have 20 blocks.
When this point is reached, I suggest a new operation takes place; a merge into blocks of double the size.
After this merge the 20 blocks are now 10. Note that this merge takes a "very moderate" amount of extra memory.
Growth now proceeds as before, by adding new blocks.
This avoids the need for the LARGE amount of temporary memory use when growing an ArrayList, and still has (as far as I can work out) the same O() performance as ArrayList.
BugBear
John Ersatznom - 22 Dec 2006 10:12 GMT > Yup. You then lose the constant-time indexed access, though. get(int) now > becomes linear based on the number of blocks. That can be improved to logarithmic by recursively nesting the blocks. In the extreme case, the list is implemented as a binary tree under the hood, with the bit sequence of the index giving a series of left/right turns to make from the root down to the correct object. The number of links to follow is O(log n) in this case. When the list reaches 2^foo size, a new root node is created with the old one as left child and new list items develop the right side of the tree. When it's doubled again, this process happens again, and so forth. Of course, the index bit sequence used to navigate to an item has to be read from left to right starting with the nth bit and ending with the zeroth, with n the tree's height minus 1. The height is easy to track, though, since you can just stuff a zero in it for an empty list and add one every time a new root node is created. Memory can be saved by leaving unused parts of the tree absent, with a null child reference. An empty list has a null root node, a height of zero, and a grow threshold of one. Go to add an item and the threshold is reached, so a new root node is made, the old (null) added to it as its left child, and the object stored at index 0. The height's now 1, so the height minus 1 is 0 and no navigation is done and the object's stored at the root. The theshold is doubled to 2. Add another item and the threshold is reached again, so we end up with a root node with a left child node holding the object. The index 0 is now read to 1 bit to find that object, and that bit is a 0, so you make one left from the root. The new object's index of 1 has the bit a 1, so it's stored by creating a right child of the root node and storing the object there.
Turning the index into a path is cheap and O(log n) as well: you can just take a copy of the grow threshold, halve it, and & the index with this to get the first left/right choice; halve it and & the index for the second; and so forth. For the two-element list above, the threshold is 2 (meaning it's full) so we'd halve that (1) and look at index & 1, i.e. the zeroth bit. For a three- or four-element list the threshold has become 4, so we'd check index & 2 and then index & 1 in turn.
> You could also use a tree (or a tree of arrays) structure to balance space, > sequential access, and indexed access patterns. See above. You can get O(log n) indexed access out of that too. Sequential access is had by either the usual traversal algorithm or putting the linked list prev and next refs into the leaf node objects. There's probably also a way to halve the size of these trees by storing objects in all the nodes rather than only the leaf nodes too, but I can't be arsed to reinvent it from scratch right now. :)
> That said, jakarta commons has a collections package that provides some useful > implementations, and more importantly some useful abstract base classes that > may be easier to extend than writing from scratch. Linkie?
> And THAT said, the previous responses which advised to change the system not > to NEED such an operation are better than making a list which can do it. That depends on the system's ultimate requirements. It may or may not be possible, separately for each system where this question arises.
Simon Brooke - 20 Dec 2006 19:39 GMT > Hi All, > > While I am retrieving from the data base I am getting more than 45k > Records. I am unable to stored all those records in to the List. Here I > am getting OutOfMemoryException. How Can i store all those records in > the List. Records belong in the database. Retrieve them a few at a time. As a general rule, trying to schlurp the whole result of a database query into memory is a bad idea, because the data can be arbitrarily big.
 Signature simon@jasmine.org.uk (Simon Brooke) http://www.jasmine.org.uk/~simon/
;; Life would be much easier if I had the source code.
Mark Jeffcoat - 21 Dec 2006 00:30 GMT >> Hi All, >> [quoted text clipped - 6 lines] > rule, trying to schlurp the whole result of a database query into memory > is a bad idea, because the data can be arbitrarily big. More concretely:
If you're getting a java.sql.ResultSet from your database, you're already retrieving them a few at a time. (I have few nice things to say about this interface, but it is handy that you get this for free.) You can step through the results by repeatedly calling next(), and never have to worry about memory, as long as you can make your computation without keeping a reference to all those rows.
(You could even wrap up this behavior into your own List subclass, which would provide an Iterator that would call ResultSet.next(). I think this would be a terrible idea.)
 Signature Mark Jeffcoat Austin, TX
Daniel Pitts - 20 Dec 2006 19:58 GMT > Hi All, > [quoted text clipped - 6 lines] > > Srinivas. If possilbe, try to retrieve only a subset (using LIMIT in MySQL)
If you go through the records, and test them for some condition, and THEN work with them, do that test in the SQL statement, not Java.
Depending on how you are accessing the database, there may be ways to "paginate" your result list, so that only a portion is loaded in memory at any one time.
srinivas.veeranki@gmail.com - 26 Dec 2006 10:27 GMT Hi All,
I am retrieving the values fromt he database using the SpringFramework. In SpringFramework we have an execute method which returns the list and the query whcih we pass to the execute method returns 45k records
The code where i am getting out of memory is pasted below
String slSelectAdministradores = " SELECT codigo_administrador codigo," + " razon_social descripcion," + " clase_documento,identificacion," + " estado_administrador estado" + " FROM administradores noholdlock" + " WHERE razon_social >= ? and" + " razon_social <= ? "; SelectAdministradore objSelectAdministradore = new SelectAdministradore(dscDataSource, slSelectAdministradores); objSelectAdministradore.declareParameter(new SqlParameter(Types.VARCHAR)); objSelectAdministradore.declareParameter(new SqlParameter(Types.VARCHAR)); Object[] oalAdministradore = new Object[] { nombreInicial, nombreFinal }; objSelectAdministradore.compile();
List li1 = objSelectAdministradore.execute(oalAdministradore);
/** * Inner Class * */ class SelectAdministradore extends MappingSqlQuery {
CodDescripcionObj obj;
SelectAdministradore(DataSource dsaDataSource, String saSelectAdministradores) { super(dsaDataSource, saSelectAdministradores); }
protected Object mapRow(ResultSet rs, int pRowNum) throws SQLException {
obj = new CodDescripcionObj(); obj.setCodigo(rs.getString("codigo")); obj.setDescripcion(rs.getString("descripcion")); obj.setClase_documento(rs.getString("clase_documento")); obj.setIdentificacion(rs.getString("identificacion")); obj.setEstado(rs.getInt("estado")); return obj; } }
> Hi All, > [quoted text clipped - 6 lines] > > Srinivas.
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|