`@time` gives the total memory allocated by a piece of code, but because of garbage collection I can have code that allocates much more memory than my system has RAM. Is there a way to see the maximum working memory of a piece of code (i.e., the minimum amount of RAM a computer needs to be able to run it)? (I need to know how much memory to request from a computing cluster.)
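For the cluster-request use case, one crude but practical measurement is the process's peak resident set size, which Base exposes as `Sys.maxrss()`. A sketch contrasting it with `@time`'s cumulative allocation count (the workload and sizes here are just illustrative):

```julia
# Compare cumulative allocation (what @time reports) with peak resident
# memory (what Sys.maxrss reports) for an allocation-heavy loop.
function churn(n)
    s = 0.0
    for _ in 1:n
        s += sum(zeros(10^6))   # ~8 MB temporary, immediately garbage
    end
    return s
end

churn(1)                        # compile first, so @time measures only the loop
@time churn(100)                # cumulative allocation: roughly 100 × 8 MB
peak_gb = Sys.maxrss() / 2^30   # high-water mark of the whole process, in GiB
println("peak RSS so far: ", round(peak_gb, digits=2), " GiB")
```

Caveat: `Sys.maxrss()` is a lifetime high-water mark for the whole process (it never goes down), so to attribute it to one piece of code you need to run that code in a fresh session.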
This would be really handy, and something we could probably already obtain easily; the GC should know this, no?
I don’t think this is likely to be easy to compute.
GC will run when it’s triggered, and if not enough memory is freed then Julia will request more memory from the OS. If no more memory is available then it will crash.
But the required memory can depend a lot on what the program tries to do, which might be user-input dependent.
Even if you're doing a single calculation that is determined ahead of time, peak memory can depend on the RNG, or can transiently spike and then fall back down quickly. And if more memory is available, GC runs less often, so peak usage can climb higher than on a smaller machine, where GC runs more often and keeps the maximum size low.
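To make the RNG point concrete, here is a toy sketch (the function and sizes are made up for illustration) where the transient peak genuinely differs from run to run:

```julia
using Random

# Peak memory here depends on the RNG: the temporary buffer's size is
# drawn at random, so the high-water mark varies from run to run.
function rng_dependent_work(rng)
    n = rand(rng, 10^5:10^7)  # random buffer size
    buf = zeros(n)            # transient peak of ~8n bytes
    return sum(buf) + n       # buf becomes garbage after this returns
end

rng = MersenneTwister(1234)
println("this run's transient buffer was ", rng_dependent_work(rng), " elements")
```

Relatedly, on Julia 1.9+ you can approximate "running on a smaller machine" with the `--heap-size-hint=<size>` command-line flag, which asks the GC to try to keep the heap below the given size.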
Basically it seems like you can get very rough estimates but not really any tight bounds.
For example, suppose a previous full GC ended with 2GB of RAM in use, and the next full GC collects its garbage and again leaves only 2GB in use. Does that mean at no time between the two full GCs did we need more than 2GB? No: it's possible that at some point we were using a lot more than 2GB of non-garbage. Perhaps we needed 4GB just a moment ago, then calculated some reduced quantity, turning the large temporary array into garbage; then a print to a file triggered GC, and the big array was collected.
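That scenario can be sketched as follows, scaled down to MB instead of GB so it runs anywhere (the sizes are illustrative, not measured from a real workload):

```julia
# A transient peak that leaves no trace after the next full GC.
function peak_then_reduce()
    big = rand(10^7)        # ~80 MB temporary: the transient peak
    m = maximum(big)        # reduce it to a single Float64
    return m                # `big` becomes garbage when this returns
end

m = peak_then_reduce()
GC.gc()                     # a full collection frees the big array
# Live heap after this GC is small, yet the run needed ~80 MB moments ago,
# so "memory in use after a full GC" badly understates the peak requirement.
live_mb = Base.gc_live_bytes() / 2^20
println("live after GC: ", round(live_mb, digits=1), " MB; result = ", m)
```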