For Programmers: Free Programming Magazines  


Home > Archive > Java Help > August 2006 > Memory Allocation in Java









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Memory Allocation in Java
Christopher Smith

2006-08-26, 7:01 pm

Hi All -

Problem: I have a large array of floating point numbers I need to look
for. These results come from a brut-force grid search, where the
coordinates (x,y) are non-parametric test results.

The problem is that the length of x and y, and thus the size of the grid is
quite large. The length is a minimum of 120,000 both directions on the
grid, for a total of 14,400,000,000 possible combinations. Which obviously
consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit
floating point.

I'm trying to think through the best way to handle this. Currently, I'm
setting up an array Array[][] results = new Array[120000][120000].


Questions
When intitailizing the array, the ide usually hangs. Is the compiler
trying to find contingous memory?
Is there a better way to handle this the results? If I initialized a
vector of length 14.4 million, would that be faster? Any ideas are greatly
appreciated.

Thanks, chris
Andrew Thompson

2006-08-26, 7:01 pm

Christopher Smith wrote:
....
> Problem: I have a large array of floating point numbers

....
> ..... Which obviously
> consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit
> floating point.


Note that you can increase the memory available to
a Java process. See the -Xms/-Xmx flags.
<http://java.sun.com/j2se/1.5.0/docs...s/java.html#Xms>

Andrew T.

Patricia Shanahan

2006-08-26, 7:01 pm

Christopher Smith wrote:
> Hi All -
>
> Problem: I have a large array of floating point numbers I need to look
> for. These results come from a brut-force grid search, where the
> coordinates (x,y) are non-parametric test results.
>
> The problem is that the length of x and y, and thus the size of the grid is
> quite large. The length is a minimum of 120,000 both directions on the
> grid, for a total of 14,400,000,000 possible combinations. Which obviously
> consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit
> floating point.
>
> I'm trying to think through the best way to handle this. Currently, I'm
> setting up an array Array[][] results = new Array[120000][120000].
>
>
> Questions
> When intitailizing the array, the ide usually hangs. Is the compiler
> trying to find contingous memory?
> Is there a better way to handle this the results? If I initialized a
> vector of length 14.4 million, would that be faster? Any ideas are greatly
> appreciated.
>
> Thanks, chris


If it were a contiguous memory issue, that would go away if you
initialized each array separately - remembering you really have a
120,000 element array whose elements are references to 120,000 element
arrays of float.

Do you have enough swap space in the machine?

Do you have problems if you run it outside the IDE?

Patricia
Christopher Smith

2006-08-26, 7:01 pm

"Andrew Thompson" <andrewthommo@gmail.com> wrote in
news:1156604613.048633.100510@74g2000cwt.googlegroups.com:

> Christopher Smith wrote:
> ...
> ...
>
> Note that you can increase the memory available to
> a Java process. See the -Xms/-Xmx flags.
> <http://java.sun.com/j2se/1.5.0/docs...s/java.html#Xms>
>
> Andrew T.
>


Andrew -

Thanks for your response. I did try that. I use the vm option: -Xmx768m.
After looking at the link, I realized that the x option is the cap, not
the initial size. Let me rerun and see if that helps.

Christopher Smith

2006-08-26, 7:01 pm

Patricia Shanahan <pats@acm.org> wrote in
news:5vZHg.1553$xQ1.129@newsread3.news.pas.earthlink.net:

> Christopher Smith wrote:
>
> If it were a contiguous memory issue, that would go away if you
> initialized each array separately - remembering you really have a
> 120,000 element array whose elements are references to 120,000 element
> arrays of float.
>
> Do you have enough swap space in the machine?
>
> Do you have problems if you run it outside the IDE?
>
> Patricia


Gotcha. Let me try it first with the right compiler flag (please see my
response to Andrew).

If that doesn't help, I think what you're saying is that I should
initialize the array along the X axis and then intiatialize an array
within each "row" -- for lack of a better term. Am I getting this right?

Should have plent of mem. Over 1GB of ram and healthy swap space
(windows xp prof is the current platform but the app will run regulary on
redhat.)

Also, I'll try outside the IDE.
Patricia Shanahan

2006-08-26, 7:01 pm

Christopher Smith wrote:
....
> Gotcha. Let me try it first with the right compiler flag (please see my
> response to Andrew).
>
> If that doesn't help, I think what you're saying is that I should
> initialize the array along the X axis and then intiatialize an array
> within each "row" -- for lack of a better term. Am I getting this right?


I don't know that this will do any good, but it seems worth a try. Note
that it gives you the option of reporting progress. You could output a
message including memory statistics every 1000 rows. Even if it does not
work, that may give some more clues.

Patricia



Eric Sosman

2006-08-26, 7:01 pm

Christopher Smith wrote:

> Hi All -
>
> Problem: I have a large array of floating point numbers I need to look
> for. These results come from a brut-force grid search, where the
> coordinates (x,y) are non-parametric test results.
>
> The problem is that the length of x and y, and thus the size of the grid is
> quite large. The length is a minimum of 120,000 both directions on the
> grid, for a total of 14,400,000,000 possible combinations. Which obviously
> consumes a lot of memory -- somewhere on the order of 500 MB, if 32-bit
> floating point.


... for suitable values of "somewhere on the order of."
120000 * 120000 * 4 = 57600000000 ~= 55000 MB ~= 54 GB. Are
you sure the dimensions you've given are correct?

If the dimensions are correct, I hope you have a 64-bit
JVM and a pretty substantial machine to work with.

--
Eric Sosman
esosman@acm-dot-org.invalid
Andrew Thompson

2006-08-26, 10:02 pm

Christopher Smith wrote:[color=darkred]
> "Andrew Thompson" <andrewthommo@gmail.com> wrote in
> news:1156604613.048633.100510@74g2000cwt.googlegroups.com:
>

(Follow-Up relevant to comments made on other
messages in this thread)

AFAIU - the Xmx argument should do it.

To be honest, I couldn't find an 'anchor' to jump directly to
the Xmx description (damn Sun HTML!), but since it was
right beneath the #Xms anchor - so I decided to mention
it at the same time.

OTOH - Eric made some interesting numbers fall out
of your specs..

Andrew T.

Patricia Shanahan

2006-08-26, 10:02 pm

Eric Sosman wrote:
> Christopher Smith wrote:
>
>
> ... for suitable values of "somewhere on the order of."
> 120000 * 120000 * 4 = 57600000000 ~= 55000 MB ~= 54 GB. Are
> you sure the dimensions you've given are correct?
>
> If the dimensions are correct, I hope you have a 64-bit
> JVM and a pretty substantial machine to work with.
>


Good point. I didn't check the arithmetic on the memory size.

This raises a whole different set of issues. Maybe brute force is not
the way to go.

How many of the elements of the matrix are non-zero? Maybe this is a
case for sparse matrix techniques?

If dense, it might be better to keep it on disk. That raises a whole set
of issues of whether it is possible to batch and sort updates to the
matrix to reduce the amount of I/O.

Patricia
Christopher Smith

2006-08-27, 7:02 pm

Patricia Shanahan <pats@acm.org> wrote in
news:Ha7Ig.1997$bM.233@newsread4.news.pas.earthlink.net:

> Eric Sosman wrote:
>
> Good point. I didn't check the arithmetic on the memory size.
>
> This raises a whole different set of issues. Maybe brute force is not
> the way to go.
>
> How many of the elements of the matrix are non-zero? Maybe this is a
> case for sparse matrix techniques?
>
> If dense, it might be better to keep it on disk. That raises a whole
> set of issues of whether it is possible to batch and sort updates to
> the matrix to reduce the amount of I/O.
>
> Patricia


Thanks for straightnening out my math.

I do have horsepower, but don't want to tie up that many resources.

No, sparse matrix math won't work. Every field has a value.

I guess divide and conquer is the right way to go. What I can do is
splice the grid-search into quadrants, process each quadrant, report and
record the quadrant results (i.e., I'm searching for the Max within the
grid). From there, it's just a matter of rolling through the quadrants.

Thanks for your help.

Chris
Oliver Wong

2006-08-28, 7:03 pm


"Christopher Smith" <csmith@mclellanfundsREMOVE.com> wrote in message
news:Xns982CBB4783873csmithmclellanfunds
c@216.196.97.142...
>
> Let me see if a picture can help:
>
> this is for one iteration. X ranges from -5 to 5 and Y ranges from -10 to
> 10.
>
> Time: 1 2 3 4 5 6 7
> Sample: 10 12 -10 -2 5 2 6
>
> Pos(x test) T T T T T
> else(y test) T F
>
>
> Cond X (samp > 6): T T T F T
> Cond Y (samp > -4): F F
> ---------------------------------------------------------------
> Time Series Result: 10 12 -2 5 6


You should probably avoid tabs for fixed-width ASCII tables.

>
> The time series mean would be (10,12,-2,5,6)/5 = 6.2
> and the standard deviation would be 4.83.
> The ratio would be 6.2/4.83, or 1.28.
>
> This is what is saved in the grid, in this case at coordinates 6,-4.
>
> Does this make sense? I've left out quite a few details to make it
> simpler.


It's not clear what "This" refers to in "This is what is saved in the
grid".

>
>
> Quick followup question about mathematics in java. What's faster:
>
> x += 12 or x = x + 12. Just wondered if there was any difference?


A good JVM should treat the two equivalently. If you're looking for
optimizations, it's not at the right level.

- Oliver

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com