Memory issues

Hi! Got a question on memory handling:

For the following script

using Dates   # for the timestamps in the log messages

nlon = 360; nlat = 181; nhr = 744; np = 37;

for loop = 1 : 5

    # Fresh arrays are allocated on every pass of the outer loop
    @info "$(Dates.now()) - Preallocating arrays ..."
    Ta = zeros(nlon,nlat,nhr,np); sH = zeros(nlon,nlat,nhr,np);
    za = zeros(nlon,nlat,nhr,np); Tm = zeros(nlon,nlat,nhr);

    @info "$(Dates.now()) - Extracting Surface-Level data for $(loop) ..."
    Ts = rand(nlon,nlat,nhr); Td = rand(nlon,nlat,nhr)

    @info "$(Dates.now()) - Extracting Pressure-Level data for $(loop) ..."
    for pii = 1 : np
        # rand(...) stands in for the real data read at each pressure level
        Ta[:,:,:,pii] = rand(nlon,nlat,nhr)
        sH[:,:,:,pii] = rand(nlon,nlat,nhr)
        za[:,:,:,pii] = rand(nlon,nlat,nhr)
    end

end


After a loop or two, my job gets killed on a cluster because the amount of memory required is too large. However, I requested 80GB of memory, so honestly I am not sure what the issue is here. Is there something I’m missing in memory handling in for-loops?

For reference, 360*181*37*24*37*3 = 6.5*10^9, which in Float64 is around 51 GB, so 80 GB is definitely enough memory for at least one loop (and indeed it works fine for the first loop or two). What I'm not sure about is whether the memory carries over into the next iterations. Why would this be the case?
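As a cross-check of that estimate, here is a rough sketch that redoes the arithmetic with the dimensions from the script as posted (nhr = 744 rather than 37*24). It only counts the named arrays; the temporaries created by each rand(...) call are extra. With these numbers one iteration comes to roughly 44 GB, so two iterations' worth of arrays alive at the same time would not fit in 80 GB.

```julia
# Dimensions copied from the script above
nlon, nlat, nhr, np = 360, 181, 744, 37

# Size in bytes of one 4-D array (Ta, sH, za) and one 3-D array (Tm, Ts, Td)
bytes_4d = nlon * nlat * nhr * np * sizeof(Float64)
bytes_3d = nlon * nlat * nhr * sizeof(Float64)

# Three 4-D arrays plus three 3-D arrays per iteration
total = 3 * bytes_4d + 3 * bytes_3d

println(round(bytes_4d / 1e9, digits = 2), " GB per 4-D array")   # 14.35
println(round(total / 1e9, digits = 2), " GB per iteration")      # 44.21
println(round(2 * total / 1e9, digits = 2), " GB if two iterations overlap")
```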

Edit: Below is the output:

[nwong@holy7c26401 scripts]$ julia test2.jl 
[ Info: 2020-04-14T18:21:27.308 - Preallocating arrays ...
[ Info: 2020-04-14T18:21:40.976 - Extracting Surface-Level data for 1 ...
[ Info: 2020-04-14T18:21:41.569 - Extracting Pressure-Level data for 1 ...
[ Info: 2020-04-14T18:22:03.337 - Preallocating arrays ...

Why are you zeroing out the large arrays at the top of each loop without doing anything with them? You assign to Ta, sH, and za in the inner loop but you don’t actually use them as far as I can see.

Other than that, what happens if you deallocate the arrays at the bottom of the loop so that the garbage collector can deal with them more quickly? (At the risk of introducing type instability, perhaps a Ta = sH = za = Tm = [] at the bottom of the outer loop will do it.)
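A minimal sketch of that idea, assuming small illustrative sizes and a hypothetical wrapper function process_all (neither is from the original script). It rebinds the names to nothing instead of [] and then asks for a collection. Whether this lowers the peak memory in practice depends on when the collector actually runs.

```julia
# Hypothetical wrapper; dims are deliberately tiny so the sketch runs quickly
function process_all(nloops; dims = (36, 18, 24, 5))
    completed = 0
    for loop in 1:nloops
        Ta = zeros(dims); sH = zeros(dims); za = zeros(dims)

        # Stand-in for the real per-iteration work
        Ta .= loop

        # Drop the references so the arrays become unreachable,
        # then request a full garbage collection
        Ta = sH = za = nothing
        GC.gc()

        completed += 1
    end
    return completed
end

process_all(3)
```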

I left that part out because it isn't relevant to the problem: the results in Ta, sH, etc. feed a calculation that is saved to a file on each loop. It's not included here because the issue is the memory.

I have tried deallocating the arrays by setting Ta = [] individually and calling GC.gc(), but no luck either.

You need a bit more than that: garbage collection is not immediate, so new and old arrays can coexist.

Generally, I would try making operations in-place, e.g.

  1. generate Ta & friends outside the loop, and zero them out with .= inside,
  2. use rand! on the whole array or a view.
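The two steps above could look roughly like this. It is a sketch with small illustrative sizes (an assumption, so that it runs quickly), and rand! stands in for whatever actually fills the arrays. Note that Ta[:,:,:,pii] = rand(nlon,nlat,nhr) first allocates a temporary array and then copies it, whereas rand! on a view writes straight into the existing storage.

```julia
using Random  # rand! lives in the Random standard library

# Small illustrative sizes, not the real grid
nlon, nlat, nhr, np = 36, 18, 24, 5

# Step 1: allocate once, outside the loop
Ta = zeros(nlon, nlat, nhr, np)
sH = zeros(nlon, nlat, nhr, np)
za = zeros(nlon, nlat, nhr, np)

for loop in 1:5
    # Reset in place with .= instead of calling zeros(...) again
    Ta .= 0; sH .= 0; za .= 0

    for pii in 1:np
        # Step 2: fill each pressure level in place through a view
        rand!(view(Ta, :, :, :, pii))
        rand!(view(sH, :, :, :, pii))
        rand!(view(za, :, :, :, pii))
    end
end
```

This way the large arrays are allocated exactly once, so there is never a second set waiting for the garbage collector.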