To get the best performance from my Julia RPC server in production, I turn off the GC while each request executes and call it manually after returning the response. Each Julia RPC server process handles only one request at a time, so it's relatively straightforward to do this. We also pre-allocate nearly everything we can, so going OOM within a single request is normally not a concern even with the GC off. However, precompilation sometimes doesn’t cover everything, and the first time each RPC is executed, JIT compilation is needed, sometimes allocating more than 1GB. Multiply this across a number of workers behind a load balancer and it can become a problem after rolling out an update.
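For readers who want something concrete, here is a minimal sketch of the pattern described above, not the actual server code. The names `with_gc_disabled`, `handler`, and `request` are hypothetical; only `GC.enable` and `GC.gc` are real Julia functions.

```julia
# Hypothetical wrapper: run one request with the GC off, then restore it.
function with_gc_disabled(handler, request)
    was_enabled = GC.enable(false)   # GC.enable returns the previous state
    try
        return handler(request)
    finally
        GC.enable(was_enabled)       # restore even if the handler throws
    end
end

# Stand-in handler and request, for illustration only.
response = with_gc_disabled(x -> x .+ 1, [1, 2, 3])

# In a real server the response would be sent here, before collecting.
GC.gc()
```

The `try`/`finally` matters: without it, an exception in the handler would leave the GC disabled for every later request in that process.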
This brings me to my question: Is there an idiomatic/standard way to always re-enable the GC whenever JIT compilation happens?
Right now my workaround is to use a lookup table to remember whether an RPC has been called before, and to leave the GC on the first time each method is called. Since the method's type signature is always the same, this is mostly effective in practice. But I was wondering if maybe there is a truly “correct” solution which doesn’t rely on guessing/heuristics. Making sure I have 100% precompile coverage should of course work too, but it's not always trivial to do on a deadline. Another option is to disable the GC only during performance-critical sections, which is what we did previously, but performance isn’t as good.
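A rough sketch of that lookup-table workaround, under the assumption that each RPC can be identified by a `Symbol`. `SEEN_RPCS` and `handle_rpc` are hypothetical names, and no locking is used because each process handles one request at a time.

```julia
# Hypothetical registry of RPC names that have already run once.
const SEEN_RPCS = Set{Symbol}()

function handle_rpc(name::Symbol, handler, request)
    first_call = name ∉ SEEN_RPCS
    first_call && push!(SEEN_RPCS, name)
    # GC stays on for the first (possibly compiling) call, off afterwards.
    previous = GC.enable(first_call)
    try
        return handler(request)
    finally
        GC.enable(previous)          # restore the state we found
    end
end

# Stand-in handler, for illustration only.
handle_rpc(:double, x -> 2x, 21)     # first call: GC left on
handle_rpc(:double, x -> 2x, 21)     # later calls: GC off during the call
```

This is still a heuristic: it assumes the first call exercises every method specialization the later calls will need.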
IMO the idiomatic way to make sure the GC is enabled whenever JIT compilation happens is to not disable the GC in the first place. A collection is only triggered after a certain amount of memory has been allocated, so if your code doesn’t allocate, the GC won’t run in the first place.
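One way to check whether a hot path allocates at all is `@allocated`. The sketch below is only an illustration: `sum_squares!` and `measure` are made-up names, and the measurement is wrapped in a function so that global-scope overhead is not counted.

```julia
# Hypothetical hot path that writes into a pre-allocated buffer.
function sum_squares!(buf::Vector{Float64}, xs::Vector{Float64})
    for i in eachindex(buf, xs)
        buf[i] = xs[i]^2
    end
    return sum(buf)
end

# Measuring inside a function avoids counting allocations that come from
# untyped globals rather than from the code under test.
measure(buf, xs) = @allocated sum_squares!(buf, xs)

xs = rand(1_000)
buf = similar(xs)
sum_squares!(buf, xs)        # warm-up so compilation is not measured
bytes = measure(buf, xs)     # bytes allocated by one call
```

If a request handler reports few or no bytes here after warm-up, leaving the GC enabled should cost little, since there is nothing to trigger a collection.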
In the medium term, we are hoping to expand our integration with the MMTk garbage collector to allow using a concurrent garbage collector, which should help with some of these concerns.
I’m thinking I might be able to achieve what I want by leaving the GC on but manually calling it after every request finishes. The issue is that it's occasionally running in the middle of a request: not every request, maybe every 5 or so.
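A small sketch of that approach, with a check for whether the collector ran mid-request. `serve_one` is a hypothetical name; `@timed` is the standard macro and, assuming Julia 1.5 or later, returns a named tuple with `value` and `gctime` fields.

```julia
# Hypothetical per-request wrapper: GC stays enabled, and we record whether
# any time was spent collecting while the handler ran.
function serve_one(handler, request)
    stats = @timed handler(request)
    gc_ran = stats.gctime > 0        # seconds spent in GC during the call
    return stats.value, gc_ran
end

# Stand-in handler and request, for illustration only.
response, gc_ran = serve_one(x -> sum(x), rand(100))

# After the response has been sent, collect so the next request starts
# with as much headroom as possible.
GC.gc()
```

Logging `gc_ran` per request would show how often mid-request collections actually happen and which RPCs trigger them.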
The MMTk stuff sounds promising; maybe it will be possible for memory-bound applications to use an algorithm similar to Go’s Green Tea, or maybe there will be a way to use arena allocators with more than just array types. The current state of things is more than good enough for me. I’m mostly just trying to make sure things don’t explode in production, even if that is infrequent or unlikely.