Dear Community,
maybe this is a very basic question. I need to run several computational experiments on a cluster, on which I have to reserve the amount of RAM in advance. If a job exceeds that limit, it is killed without warning. Hence, my question is whether it is possible to track how much RAM my program is using at any given moment. This would allow me to terminate the program in a controlled way, i.e., write a result file with all results obtained so far and a bit indicating the memory overflow.
Thank you in advance.
I would also like to know the answer to this. Mathematica has MemoryConstrained, which is the same idea.
I think I asked on Slack a while ago and there was nothing.
Sys.free_memory() could be used to implement this! No idea how reliable the number is… it seems to roughly correlate with what the system monitor shows me!
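Something like this minimal sketch, for example (the 2 GiB floor is just an arbitrary threshold I made up, not anything the cluster provides):

# Hedged sketch: warn when the machine's free memory drops below a threshold.
# Sys.free_memory() returns the free memory in bytes as a UInt64.
const FREE_MEM_FLOOR = 2 * 1024^3   # assumed 2 GiB floor, adjust as needed
if Sys.free_memory() < FREE_MEM_FLOOR
    @warn "Free system memory is getting low" free_bytes = Sys.free_memory()
end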
Also, some cluster environments have a mechanism like cgroups to control this, preventing your process from grabbing more memory and thus perhaps avoiding the forced termination.
As far as I understand it, this returns only the amount of free memory. I could use Sys.total_memory() - Sys.free_memory() to compute the amount of used memory; however, that is the memory used by all processes combined, and I cannot identify how much is used by my program alone.
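For illustration, the system-wide number could be computed like this (again, this covers all processes, so it does not really answer my question):

# Hedged sketch: system-wide memory in use, all processes combined.
used_bytes = Sys.total_memory() - Sys.free_memory()
used_gib = used_bytes / 1024^3
println("System-wide memory in use: ", round(used_gib; digits = 2), " GiB")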
As suggested by @Tamas_Papp, cgroups is probably the best way to do it externally, although it may still kill the process. A nice short introduction can be found here: Everything You Need to Know about Linux Containers, Part I: Linux Control Groups and Process Isolation | Linux Journal
Inside Julia, calling Base.summarysize(variable) yields the size of a variable in bytes, so one could construct a variable-inspection mechanism that interrupts the process after preserving its state. This would most likely have to be built by hand, as you will need to do the memory checking before/after the instructions that are responsible for allocating the memory.
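As a minimal sketch, assuming the large intermediate data lives in a handful of known variables (results and cache are just placeholder names, and the limit is made up):

# Hedged sketch: check the tracked size of known large variables at safe points.
const MEM_LIMIT_BYTES = 8 * 1024^3   # assumed 8 GiB budget, adjust to your reservation

function check_and_save(results, cache)
    tracked = Base.summarysize(results) + Base.summarysize(cache)
    if tracked > MEM_LIMIT_BYTES
        # preserve the state obtained so far, then stop
        open("partial_results.txt", "w") do io
            println(io, results)
        end
        error("memory budget exceeded: $tracked bytes tracked")
    end
    return tracked
end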
Good question. You should be able to measure how much memory is used by the julia process periodically, and terminate your script if it uses too much, but I’m not sure how efficient Julia is at GCing and releasing unused memory. In some other languages/frameworks (like .NET), the garbage collector can be quite reluctant to kick in, resulting in the process using a lot more memory than is actually required, just because the memory happens to be available on the system and not requested by any other process. I’d love to learn more about how this works in Julia.
The other suggestions seem good to me; in addition:
- why wait until you run out of memory to write intermediate results to a file? That seems unnecessary. Why not just write your results to a file as you go (e.g. every minute)? That might also help with memory, because if you don’t need the older results, they can be freed by the GC after they have been written to a file (a minimal sketch follows this list).
- would it be worth considering MemPool.jl?
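Here is a minimal sketch of the write-as-you-go idea, where experiment_parameters and run_experiment are hypothetical placeholders for your own loop and work function:

# Hedged sketch: append each result to disk as it is produced,
# then let the in-memory copy go out of scope so the GC can reclaim it.
open("results.csv", "a") do io
    for params in experiment_parameters()   # hypothetical iterator over experiments
        result = run_experiment(params)      # hypothetical work function
        println(io, result)
        flush(io)                            # make sure it hits disk before a possible kill
    end
end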
Thank you for your comments. I found a solution which works fine for me on Linux (Ubuntu 18.04), though it could probably be improved. The main idea is to access the file /proc/self/stat and read the current memory consumption of the process from it. E.g.,
# Return the current virtual memory size of this process in MB,
# read from field 23 (vsize, in bytes) of /proc/self/stat.
function get_mem_use()
    s = read("/proc/self/stat", String)      # reads and closes the file
    vsize = parse(Int64, split(s)[23])       # virtual memory size in bytes
    return Int(ceil(vsize / (1024 * 1024)))  # convert to MB, rounded up
end
returns the virtual memory size in MB (see also: here). It would be nice to call that function on a regular basis in the background, which I have not achieved yet. However, calling it at critical points in the code is a nice workaround.
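For the background part, here is a minimal sketch using a Timer from Base; the 10-second interval and the MEM_LIMIT_MB threshold are arbitrary assumptions and would need to match your actual reservation:

# Hedged sketch: poll get_mem_use() in the background and raise a flag
# that the main computation can check at safe points.
const MEM_LIMIT_MB = 8_000            # assumed limit, match it to your reservation
const mem_exceeded = Ref(false)

monitor = Timer(0.0; interval = 10.0) do _
    if get_mem_use() > MEM_LIMIT_MB
        mem_exceeded[] = true
    end
end

# In the main loop: if mem_exceeded[] is set, write the results obtained so far
# (plus the overflow bit) and exit cleanly; call close(monitor) when finished.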