Java Forum / First Aid / August 2006
Memory Allocation in Java
Christopher Smith - 26 Aug 2006 15:48 GMT Hi All -
Problem: I have a large array of floating point numbers I need to look for. These results come from a brut-force grid search, where the coordinates (x,y) are non-parametric test results.
The problem is that the length of x and y, and thus the size of the grid is quite large. The length is a minimum of 120,000 both directions on the grid, for a total of 14,400,000,000 possible combinations. Which obviously consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit floating point.
I'm trying to think through the best way to handle this. Currently, I'm setting up an array Array[][] results = new Array[120000][120000].
Questions When intitailizing the array, the ide usually hangs. Is the compiler trying to find contingous memory? Is there a better way to handle this the results? If I initialized a vector of length 14.4 million, would that be faster? Any ideas are greatly appreciated.
Thanks, chris
Andrew Thompson - 26 Aug 2006 16:03 GMT ...
> Problem: I have a large array of floating point numbers ...
> ..... Which obviously > consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit > floating point. Note that you can increase the memory available to a Java process. See the -Xms/-Xmx flags. <http://java.sun.com/j2se/1.5.0/docs/tooldocs/windows/java.html#Xms>
Andrew T.
Christopher Smith - 26 Aug 2006 18:11 GMT > ... >> Problem: I have a large array of floating point numbers [quoted text clipped - 8 lines] > > Andrew T. Andrew -
Thanks for your response. I did try that. I use the vm option: -Xmx768m. After looking at the link, I realized that the x option is the cap, not the initial size. Let me rerun and see if that helps.
Andrew Thompson - 27 Aug 2006 02:55 GMT > > ... > >> Problem: I have a large array of floating point numbers [quoted text clipped - 6 lines] > > a Java process. See the -Xms/-Xmx flags. > > <http://java.sun.com/j2se/1.5.0/docs/tooldocs/windows/java.html#Xms> (Follow-Up relevant to comments made on other messages in this thread)
AFAIU - the Xmx argument should do it.
To be honest, I couldn't find an 'anchor' to jump directly to the Xmx description (damn Sun HTML!), but since it was right beneath the #Xms anchor - so I decided to mention it at the same time.
OTOH - Eric made some interesting numbers fall out of your specs..
Andrew T.
Patricia Shanahan - 26 Aug 2006 16:09 GMT > Hi All - > [quoted text clipped - 19 lines] > > Thanks, chris If it were a contiguous memory issue, that would go away if you initialized each array separately - remembering you really have a 120,000 element array whose elements are references to 120,000 element arrays of float.
Do you have enough swap space in the machine?
Do you have problems if you run it outside the IDE?
Patricia
Christopher Smith - 26 Aug 2006 18:13 GMT >> Hi All - >> [quoted text clipped - 31 lines] > > Patricia Gotcha. Let me try it first with the right compiler flag (please see my response to Andrew).
If that doesn't help, I think what you're saying is that I should initialize the array along the X axis and then intiatialize an array within each "row" -- for lack of a better term. Am I getting this right?
Should have plent of mem. Over 1GB of ram and healthy swap space (windows xp prof is the current platform but the app will run regulary on redhat.)
Also, I'll try outside the IDE.
Patricia Shanahan - 26 Aug 2006 20:14 GMT ...
> Gotcha. Let me try it first with the right compiler flag (please see my > response to Andrew). > > If that doesn't help, I think what you're saying is that I should > initialize the array along the X axis and then intiatialize an array > within each "row" -- for lack of a better term. Am I getting this right? I don't know that this will do any good, but it seems worth a try. Note that it gives you the option of reporting progress. You could output a message including memory statistics every 1000 rows. Even if it does not work, that may give some more clues.
Patricia
Eric Sosman - 26 Aug 2006 20:52 GMT > Hi All - > [quoted text clipped - 7 lines] > consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit > floating point. ... for suitable values of "somewhere on the order of." 120000 * 120000 * 4 = 57600000000 ~= 55000 MB ~= 54 GB. Are you sure the dimensions you've given are correct?
If the dimensions are correct, I hope you have a 64-bit JVM and a pretty substantial machine to work with.
 Signature Eric Sosman esosman@acm-dot-org.invalid
Patricia Shanahan - 27 Aug 2006 03:10 GMT >> Hi All - >> [quoted text clipped - 14 lines] > If the dimensions are correct, I hope you have a 64-bit > JVM and a pretty substantial machine to work with. Good point. I didn't check the arithmetic on the memory size.
This raises a whole different set of issues. Maybe brute force is not the way to go.
How many of the elements of the matrix are non-zero? Maybe this is a case for sparse matrix techniques?
If dense, it might be better to keep it on disk. That raises a whole set of issues of whether it is possible to batch and sort updates to the matrix to reduce the amount of I/O.
Patricia
Christopher Smith - 27 Aug 2006 15:09 GMT >>> Hi All - >>> [quoted text clipped - 28 lines] > > Patricia Thanks for straightnening out my math.
I do have horsepower, but don't want to tie up that many resources.
No, sparse matrix math won't work. Every field has a value.
I guess divide and conquer is the right way to go. What I can do is splice the grid-search into quadrants, process each quadrant, report and record the quadrant results (i.e., I'm searching for the Max within the grid). From there, it's just a matter of rolling through the quadrants.
Thanks for your help.
Chris
Eric Sosman - 27 Aug 2006 16:07 GMT >>>>Hi All - >>>> [quoted text clipped - 34 lines] > > No, sparse matrix math won't work. Every field has a value. It's a startlingly large number of values; may I ask where they all came from? Just curious, really.
Amusing factoid: There are about 3.4 times as many fields as there are distinct `float' values.
> I guess divide and conquer is the right way to go. What I can do is > splice the grid-search into quadrants, process each quadrant, report and > record the quadrant results (i.e., I'm searching for the Max within the > grid). From there, it's just a matter of rolling through the quadrants. Could you explain the nature of this search a little more? Simply "searching for the Max" in a big collection of numbers requires very little memory; there's no need to retain a number that's known to be non-maximal.
 Signature Eric Sosman esosman@acm-dot-org.invalid
Patricia Shanahan - 27 Aug 2006 16:12 GMT >>>>> Hi All - >>>>> [quoted text clipped - 49 lines] > requires very little memory; there's no need to retain a number > that's known to be non-maximal. Indeed. If that is the typical operation, I would just stick the numbers in any convenient order in a disk file, and scan it sequentially, keeping the max so far and its location.
For many other operations, there are known out-of-core algorithms that are designed to do as much work as possible on a chunk while it is in memory. Don't assume that the right algorithm if the problem fits in memory is necessarily the right algorithm if it doesn't.
Patricia
Christopher Smith - 27 Aug 2006 17:57 GMT >>>>>Hi All - >>>>> [quoted text clipped - 37 lines] > It's a startlingly large number of values; may I ask where > they all came from? Just curious, really. Sure. The numbers represent the (mean value/stdev value) of a longtudinal time series, when tested against two conditions, x & y. In other words, it looks to find the best result on daily basis historically when a sample is tested against these two conditions (i.e., "in-sample"). Think of it as looking back in time, testing against two conditions with perfect knowledge, looking at the results for those two conditions, x & y, and find those two conditional tests that give the best results over the time series ex-post. I actually want to test against six conditions when I'm done, but am working which are the the two which explain return variance most significantly.
> Amusing factoid: There are about 3.4 times as many fields > as there are distinct `float' values. What is a "field" mean in this case?
>> I guess divide and conquer is the right way to go. What I can do is >> splice the grid-search into quadrants, process each quadrant, and hen [quoted text clipped - 6 lines] > requires very little memory; there's no need to retain a number > that's known to be non-maximal. I run through these test conditions over a time series, starting with the lowest test parameter for x and for y. I then iterate over the time series, looking at each time-record-entry for a set of conditions, including x&y, if those conditions are true, and the values are greater than x or y, I then record the entry. At the conclusion of the time series, I then iterate the result to test again, with x += x + x_step and y += y + y_step. After retesting 120k^2 times, I then look across the grid to determine which x and y test provide the maximum value. This concludes the in- sample testing. I then test out-of-sample to check projection error, but that's a whole other discussion! HTH.
I guess what you're saying is that I can discard all previous numbers each time I find a new maximum? If so, what would be the computational tradeoff between storing the results and testing for the max result, as opposed to testing for the max result as I go?
Chris
Patricia Shanahan - 27 Aug 2006 18:47 GMT ...
> I guess what you're saying is that I can discard all previous numbers each > time I find a new maximum? If so, what would be the computational tradeoff > between storing the results and testing for the max result, as opposed to > testing for the max result as I go? ...
In your situation, computation is the very least of your worries. Getting the data in and, if necessary, out in an efficient fashion will make far more difference to performance. Count the disk I/O. Look at whether the I/O is sequential or random.
If saving some data produces a net reduction in data transfer, do it. If recalculating does less I/O, do that.
Here are some tricks that may help - I don't understand your application well enough to say which are best:
1. Make sure the input data is dense. If the file contains data that you are not using, build an extracted file that contains only the data that matters.
2. Consider working on rectangular chunks of matrix, finding the best value for each chunk.
3. Consider keeping the input file in memory in a compact form, if it is small enough.
However, the big message is to evaluate each strategy in terms of its effect on data transfer.
Patricia
Eric Sosman - 27 Aug 2006 19:40 GMT > Eric Sosman <esosman@acm-dot-org.invalid> wrote: >>>[...] [quoted text clipped - 13 lines] > am working which are the the two which explain return variance most > significantly. I'm still having trouble following your description. You started out with 120000 by 120000 32-bit floating-point values, referred to the assemblage as a "grid," and spoke of the data as being "non-parametric." Okay, thus far I'm imagining a big blob of numbers representing fourteen billion measurements on a two-dimensional surface (not necessarily spatial, but 2D). But now you're speaking of "mean value/stdev value," which implies either that the measurements are grouped into sets by some rule that hasn't been specified, or that there are in fact two sets of fourteen billion numbers (fourteen billion means, fourteen billion variances) that summarize a still larger set of measurements. And as if that weren't enough, you've thrown a time axis into the mix; the data set is now to be understood as three-dimensional? Or perhaps more?
>> Amusing factoid: There are about 3.4 times as many fields >>as there are distinct `float' values. > > What is a "field" mean in this case? It was your term: "Every field has a value." I assumed you meant it to refer to one of the foutreen billion points of the 2D grid, but I'm no longer sure what you mean.
>>>I guess divide and conquer is the right way to go. What I can do is >>>splice the grid-search into quadrants, process each quadrant, and hen >>>and record the quadrant results (i.e., I'm searching for the Max >>>within the grid). From there, it's just a matter of rolling through >>>the quadrants. Note that each quadrant -- if by "quadrant" you mean "one fourth" -- is still about 13.5 GB of data. (Assuming "only" fourteen billion 4-byte numbers.)
>> Could you explain the nature of this search a little more? >>Simply "searching for the Max" in a big collection of numbers [quoted text clipped - 11 lines] > sample testing. I then test out-of-sample to check projection error, but > that's a whole other discussion! HTH. You examine "over a time series," so that's one dimension of the search. But your search over x and y is apparently one-D also, because you step both coordinates together to trace a "diagonal" across the search area. And then you "re"-test fourteen billion times; what do you mean by "re"-test? If this is the repetition, what was the original (and why didn't it suffice)?
> I guess what you're saying is that I can discard all previous numbers each > time I find a new maximum? If so, what would be the computational tradeoff > between storing the results and testing for the max result, as opposed to > testing for the max result as I go? I'm withdrawing all advice on the grounds that the more you explain what you're doing, the less I understand. Good luck!
 Signature Eric Sosman esosman@acm-dot-org.invalid
Christopher Smith - 27 Aug 2006 23:24 GMT >> Eric Sosman <esosman@acm-dot-org.invalid> wrote: >>>>[...] [quoted text clipped - 29 lines] > the data set is now to be understood as three-dimensional? Or > perhaps more? I guess I misunderstood your question. I thought you wanted to know how those numbers in the 120k x 120k grid were calculated. What I described was the process to get those results.
>>> Amusing factoid: There are about 3.4 times as many fields >>>as there are distinct `float' values. [quoted text clipped - 14 lines] > fourth" -- is still about 13.5 GB of data. (Assuming "only" > fourteen billion 4-byte numbers.) Right. I would need to break the quadrants up into quite a few pieces.
>>> Could you explain the nature of this search a little more? >>>Simply "searching for the Max" in a big collection of numbers [quoted text clipped - 19 lines] > times; what do you mean by "re"-test? If this is the repetition, > what was the original (and why didn't it suffice)? Let me see if a picture can help:
this is for one iteration. X ranges from -5 to 5 and Y ranges from -10 to 10. Time: 1 2 3 4 5 6 7 Sample: 10 12 -10 -2 5 2 6
Pos(x test) T T T T T else(y test) T F
Cond X (samp > 6): T T T F T Cond Y (samp > -4): F F --------------------------------------------------------------- Time Series Result: 10 12 -2 5 6
The time series mean would be (10,12,-2,5,6)/5 = 6.2 and the standard deviation would be 4.83. The ratio would be 6.2/4.83, or 1.28.
This is what is saved in the grid, in this case at coordinates 6,-4.
Does this make sense? I've left out quite a few details to make it simpler.
>> I guess what you're saying is that I can discard all previous numbers >> each time I find a new maximum? If so, what would be the [quoted text clipped - 3 lines] > I'm withdrawing all advice on the grounds that the more you > explain what you're doing, the less I understand. Good luck! Quick followup question about mathematics in java. What's faster:
x += 12 or x = x + 12. Just wondered if there was any difference?
Thanks, Chris
Oliver Wong - 28 Aug 2006 21:11 GMT > Let me see if a picture can help: > [quoted text clipped - 11 lines] > --------------------------------------------------------------- > Time Series Result: 10 12 -2 5 6 You should probably avoid tabs for fixed-width ASCII tables.
> The time series mean would be (10,12,-2,5,6)/5 = 6.2 > and the standard deviation would be 4.83. [quoted text clipped - 4 lines] > Does this make sense? I've left out quite a few details to make it > simpler. It's not clear what "This" refers to in "This is what is saved in the grid".
>>> I guess what you're saying is that I can discard all previous numbers >>> each time I find a new maximum? If so, what would be the [quoted text clipped - 7 lines] > > x += 12 or x = x + 12. Just wondered if there was any difference? A good JVM should treat the two equivalently. If you're looking for optimizations, it's not at the right level.
- Oliver
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|