I have a tricky problem bothering me:
I am using Julia 1.5 with HTTP.jl and Mux.jl. I added multi-threading support in the same manner as Workers.jl in quinnj/MusicAlbums.jl (github.com), but compared to the single-threaded server, the garbage collector kicks in too infrequently. I have 9 worker threads and 1 main thread that accepts the requests.
The problem is that I seem to have either a memory leak or a garbage collector that rarely kicks in: it takes several hours or even days to clean up a significant portion of memory.
While processing a request, I download data, perform inference using an ensemble of neural networks, then update a table and return the result. When there are no incoming requests, the server uses ~30GB of RAM, but when requests are coming in, it goes up to 50GB and more.
This is a chart of 2 replicas: orange is the maximum RAM, green is the average between them.
In the image on the left we can see that after ~4 hours of memory growth the GC performs a significant cleanup (~5GB of RAM), and after another ~4 hours it cleans up another ~5GB, but after that nothing happens; the memory just keeps growing.
When I had 12 worker threads, the memory grew even faster, and the GC was not cleaning it up.
How does the GC interact with multithreading?
It seems that if the threads are busy doing work all the time, the GC waits, and it can wait several days before it really cleans up a larger portion of memory.
But I’m not sure if I’m interpreting it correctly.
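To check that interpretation, something like the sketch below could be logged periodically to see whether collections are happening at all while memory grows. It is only a sketch: `gc_snapshot` is a hypothetical helper name, and it relies on `Base.gc_num()`, which is internal and unexported, so its fields (`pause`, `full_sweep`, `total_time`) are an assumption that may not hold across Julia versions.

```julia
# Hypothetical helper: snapshot of GC counters.
# Base.gc_num() is internal to Julia; field names are assumed, not guaranteed.
function gc_snapshot()
    n = Base.gc_num()
    return (
        pauses      = Int(n.pause),                  # collections run so far
        full_sweeps = Int(n.full_sweep),             # of those, full sweeps
        gc_time_s   = n.total_time / 1e9,            # total GC time (reported in ns)
        live_MB     = Base.gc_live_bytes() / 2^20,   # heap the GC considers live
    )
end

# Compare two snapshots taken some time apart.
s1 = gc_snapshot()
GC.gc()
s2 = gc_snapshot()
@info "GC activity" new_pauses = s2.pauses - s1.pauses new_full_sweeps = s2.full_sweeps - s1.full_sweeps live_MB = s2.live_MB
```

If the pause count keeps increasing while the process RAM still grows, the GC is running and the growth would have to come from somewhere else; if it barely moves, that would support the idea that the GC is not being triggered.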
Is there a way to force the GC to run more often?
Is there a way to suspend all threads from time to time so that the GC would run more often?
I ask because, from my observations, in the 2 days after the last large cleanup only small portions of memory got cleaned up, nothing big.
I would rather have, e.g., 1 second during which the server waits and cleans up the memory than this endless growth without significant cleanup.
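To make the question concrete, this is roughly what I have in mind: a minimal sketch, not something I have verified to help. It assumes that calling `GC.gc()` from a background timer is acceptable for the server; `gc_and_report` and `start_periodic_gc` are hypothetical names of my own, not part of HTTP.jl or Mux.jl, and I do not know how a forced collection behaves while the worker threads are busy.

```julia
# Hypothetical helper: force a full collection and return the live heap
# size (in bytes) before and after, so each forced run can be logged.
function gc_and_report()
    before = Base.gc_live_bytes()
    GC.gc()                          # request a full collection
    after = Base.gc_live_bytes()
    return (before = before, after = after)
end

# Hypothetical helper: call gc_and_report() every `interval_s` seconds.
# Returns the Timer; close(timer) stops the periodic collection.
function start_periodic_gc(interval_s::Real)
    return Timer(interval_s; interval = interval_s) do _
        r = gc_and_report()
        @info "forced GC" live_before_MB = r.before / 2^20 live_after_MB = r.after / 2^20
    end
end
```

Usage would be something like `timer = start_periodic_gc(60)` at server start-up and `close(timer)` on shutdown; the 60 second interval is an arbitrary placeholder.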