I am trying to run my Julia code on a computing cluster running SLURM. It runs just fine on my own PC, but whenever I try to run it on the cluster (even when I give the job the same amount of total memory), it gets canceled for using too much memory. I suspect the reason is that Julia thinks it has access to all of the memory on the compute node (`Sys.total_memory()` always reports the node's total memory, not the amount assigned to my job), and as a result runs GC too infrequently. Is there a way to tell Julia how much memory it actually has access to?
With Julia v1.9 you can try setting `--heap-size-hint=<size>` (see the changelog entry in julia/HISTORY.md at e48a0c99bef949a84979c05dc33fd5578f684c1d · JuliaLang/julia · GitHub). This isn't exactly what you asked for, but it should make the garbage collector work more eagerly than it would by default.
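A minimal sketch of how this could look in an SLURM batch script (the script name `my_script.jl` and the 90% headroom factor are assumptions; SLURM exports the job's memory request, in MB, as `SLURM_MEM_PER_NODE` when `--mem` is used):

```shell
#!/bin/bash
#SBATCH --mem=16G
#SBATCH --time=01:00:00

# Pass the job's memory limit to Julia as a heap-size hint.
# SLURM_MEM_PER_NODE is in MB; staying somewhat below the cgroup
# limit leaves headroom for non-heap allocations.
julia --heap-size-hint=$((SLURM_MEM_PER_NODE * 9 / 10))M my_script.jl
```

Note that the hint is a soft target for triggering GC, not a hard cap, so some headroom below the job's actual limit is still advisable.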
What version of Julia are you using? I seem to remember there were some changes fairly recently in how we query the available memory.