Accumulating memory in nested loops

performance

#1

Is it normal for memory usage to accumulate when running loops (many iterations, including nested loops) inside a function in Julia? I’m using Julia 0.4.5, and I thought Julia’s garbage collector would help clean up memory.

I wrote a toy example that mirrors the structure of my code, to illustrate the problem I encountered when running a large-scale simulation.

include("fcts.jl")
function sim(seed)
  #dummy=0
  A=[12,5,6,8,15,9];
  B=[4,12,19,5];
  C=[10,3,9,17];
  for iA=1:length(A), iB=1:length(B),iC=1:length(C)
    a=A[iA]*10^2; b=B[iB]*10^2; c=C[iC]*10^2
    srand(seed)
    M=randn(a,b)
   #  tic(); Method1(); T11=toq()
   #  tic(); Method2(); T12=toq()
   #  tic(); Method3(); T13=toq()
    N=randn(a,c)
  ## measure corresponding time
    K=randn(b,c)
  ## measure corresponding time
  ## save results for each iteration in case the simulation stops b/c of error
  #  filePath = "..."
  #   fp = open(filePath*string(myid())*".csv", "a")
  #   writecsv(fp,hcat(T11,T12,T13,...,seed))
  #   close(fp)
  end
end

My original simulation runs in parallel with pmap(sim,1:30). While monitoring memory usage with the “top” command in a terminal, I see the memory accumulate gradually, and the job finally gets killed when it goes beyond the limit.

Could anyone tell me if there is any obvious problem which causes the heavy memory usage?


#2

Please quote (using `) and indent your code so that it is readable.


#3

(I edited your post to fix the code quoting.)


#4

You are allocating new M, N, and K matrices in each loop iteration. The old ones will eventually get garbage collected, but the memory usage reported by top is notoriously deceptive in garbage-collected languages. Top reports multiple numbers; are you looking at the RSS size?


#6

Both the VIRT and RES values reported by top gradually accumulate. IT support told me that RES (“resident memory”) is the more relevant one.
Since the iterations are independent of each other, I thought the memory would be released after each iteration. Is it supposed to allocate new space in each iteration? Or is there some way to get around that?


#7

Memory will be released eventually, but not immediately after each iteration. But it should be released long before memory starts to fill up. (See https://en.wikipedia.org/wiki/Tracing_garbage_collection)

I tried your code on my machine and didn’t see any problem. map(sim, 1:30) runs in about 100sec, and the memory usage fluctuates around 200MB the whole time. Can you give a minimal working example that actually demonstrates your problem?


#8

Thanks for the reply!
I made an example of my simulation which captures most of its features. For simplicity, I’ve removed many steps, including the calls to compiled C code. It can be downloaded here: https://github.com/wenlongG/Example.git
At the same time, I’m also checking the C code about proper memory release.

I’m not an expert in writing Julia functions, so if there is some improper usage, please let me know. Thank you!


#9

In the original simulation study I call C code many times, and it returns big sparse matrices to Julia in each iteration. I just realized that if I free the space in C, then no value gets passed to Julia.
That could be what is causing the memory to accumulate… But I don’t have a better way to deal with this.


#10

You have to make sure that Julia de-allocates this memory.

One option is to use the own=true keyword option to the unsafe_wrap function, to convert a C-allocated pointer to a Julia array where Julia calls free on the memory when it is done.
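A minimal sketch of this first option, assuming Julia 1.x syntax. Here Libc.malloc and unsafe_store! only stand in for a C routine that returns a malloc-allocated buffer; the variable names are made up for illustration. (In Julia 0.4 the equivalent call was pointer_to_array(ptr, dims, true).)

```julia
# Stand-in for a C function that mallocs a buffer and returns the pointer.
n = 5
ptr = convert(Ptr{Float64}, Libc.malloc(n * sizeof(Float64)))

# Stand-in for the C code filling in the data.
for i in 1:n
    unsafe_store!(ptr, Float64(i), i)
end

# own=true: Julia takes ownership and calls free on the buffer
# when the array `A` is garbage-collected. Do not free `ptr` yourself.
A = unsafe_wrap(Array, ptr, n; own=true)
total = sum(A)
```

This only works if the buffer really was allocated with malloc (or something compatible with free), since that is what Julia will call on it.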

Another option is to allocate the array in Julia and pass the pointer to C, letting the C code fill in the data as needed.
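A rough sketch of this second option, assuming Julia 1.x. The C standard library's memset is used here purely as a stand-in for your own C routine that fills a caller-supplied buffer; the buffer size and fill value are arbitrary.

```julia
# Julia allocates (and therefore owns) the buffer.
buf = Vector{UInt8}(undef, 16)

# Pass the array to C; ccall converts it to a pointer to its data.
# memset(dest, value, nbytes) fills the buffer in place.
ccall(:memset, Ptr{UInt8}, (Ptr{UInt8}, Cint, Csize_t), buf, 7, length(buf))

# `buf` now holds the data written by C and is freed by Julia's GC
# like any other array; the C side never calls malloc or free.
```

With this layout the C code needs to know the required size up front (or expose a separate call that reports it), which may be awkward for sparse results whose size is only known after the computation.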

(If you are allocating lots of matrices in a loop and then discarding them, it is usually preferable from a performance perspective anyway to allocate the memory once, and then mutate the contents in the loop rather than re-allocating.)
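As an illustration of the allocate-once pattern, here is a small sketch assuming Julia 1.x and a matrix size that stays fixed across iterations (unlike the toy example above, where the sizes change). The function name sim_prealloc and the summing step are hypothetical placeholders for the real work.

```julia
using Random

function sim_prealloc(seed, n_iter, a, b)
    M = Matrix{Float64}(undef, a, b)  # allocated once, outside the loop
    acc = 0.0
    for _ in 1:n_iter
        Random.seed!(seed)
        randn!(M)        # overwrite M in place instead of M = randn(a, b)
        acc += sum(M)    # placeholder for the timed methods
    end
    return acc
end

r = sim_prealloc(1, 3, 4, 5)
```

If the sizes do vary, one possibility is to allocate a buffer for the largest case and work on views of it, at the cost of somewhat more bookkeeping.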


#11

I tried pointer_to_array(ptr,dims,true), but that didn’t solve the problem of accumulating memory.

The help docs on calling C and Fortran code say: “If the pointer of interest is a plain-data array (bitstype or immutable), the function pointer_to_array(ptr,dims,[own]) may be more useful. The final parameter should be true if Julia should “take ownership” of the underlying buffer and call free(ptr) when the returned Array object is finalized.”

However, calling free(ptr) afterwards gives ERROR: UndefVarError: free not defined. Does Julia call free(ptr) by itself, or would an explicit gc() do the work?

Thank you for your help!


#12

Julia calls free itself once the array is garbage-collected.

Alternatively, you can use pointer_to_array(ptr,dims,false) to retain ownership of the pointer, and call Libc.free(ptr) yourself once you are done with it.
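
A sketch of the manual-ownership variant, written with the Julia 1.x name unsafe_wrap (the successor to pointer_to_array). As before, Libc.malloc and unsafe_store! merely simulate a C routine handing back a buffer, and the values are arbitrary.

```julia
# Stand-in for a C function returning a malloc'd buffer of 4 Int32 values.
n = 4
ptr = convert(Ptr{Int32}, Libc.malloc(n * sizeof(Int32)))
for i in 1:n
    unsafe_store!(ptr, Int32(10i), i)
end

# own=false: Julia only borrows the memory and will never free it.
A = unsafe_wrap(Array, ptr, n; own=false)

# Copy what you need into Julia-owned memory, then release the C buffer.
result = copy(A)
Libc.free(ptr)   # `A` must not be read or written after this point
```

Freeing explicitly releases the memory at a known point in each iteration instead of waiting for a garbage-collection pass, which may matter when the C buffers are large.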