I am trying to run my Julia code on a computing cluster managed by SLURM. It runs just fine on my own PC, but whenever I run it on the cluster (even with the same amount of total memory), my job gets canceled for exceeding its memory limit. I suspect the reason is that Julia thinks it has access to all of the memory on the compute node (`Sys.total_memory()` always reports the node's total memory, not the amount assigned to my job), and as a result runs GC too infrequently. Is there a way to tell Julia how much memory it actually has access to?
With Julia v1.9 you can try setting `--heap-size-hint=<size>` (see julia/HISTORY.md at e48a0c99bef949a84979c05dc33fd5578f684c1d · JuliaLang/julia · GitHub). This isn't exactly what you asked for, but it should make the garbage collector work more eagerly than it would by default.
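For example, in a SLURM batch script you could derive the hint from the job's memory allocation instead of hard-coding it. This is a minimal sketch, assuming your site sets `SLURM_MEM_PER_NODE` (in MB, as SLURM does when the job is submitted with `--mem`); `script.jl` is a placeholder for your program, and the 80% headroom factor is just an illustrative choice, not a recommended value:

```shell
#!/bin/bash
# Sketch: pass the SLURM job's memory allocation to Julia via --heap-size-hint.
# Assumption: SLURM exports SLURM_MEM_PER_NODE (MB) for jobs submitted with --mem.
MEM_MB="${SLURM_MEM_PER_NODE:-8192}"   # fall back to 8 GiB when run outside SLURM

# Leave headroom for non-heap memory (code, I/O buffers, libraries):
# hint the GC at ~80% of the job's allocation.
HEAP_MB=$(( MEM_MB * 8 / 10 ))

# Echoed here for illustration; in a real batch script you would run the command.
echo "julia --heap-size-hint=${HEAP_MB}M script.jl"
```

The hint accepts suffixed sizes such as `4G` or `3000M`; using the job's own allocation keeps the script correct even when you change `--mem` at submission time.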
What version of Julia are you using? I seem to remember there were some changes fairly recently to how we query available memory.