Poor performance of garbage collection in multi-threaded application

I have read the discussion "Garbage collection not aggressive enough on Slurm Cluster" (Specific Domains / Julia at Scale, JuliaLang), specifically this snippet:

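using Distributions  # provides Uniform
thr = 0.02  # probability of forcing a collection each time this line runs
rand(Uniform(0, 1)) < thr && GC.gc()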

I'm now trying something similar. Interestingly, when GC.gc() runs approximately every 100 s, it does not seem to perform a deep clean. In the plot, the low part on the left is when there are no incoming requests; it then grows under load and stays there.
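For reference, here is a minimal sketch of a periodic forced collection that also reports how much the GC-tracked heap shrinks. The names `gc_and_report` and `start_gc_loop` are hypothetical helpers, not part of Julia, and `Base.gc_live_bytes()` is an unexported function whose exact meaning may differ between Julia versions. If the live bytes barely drop after a full collection, the objects are probably still reachable rather than uncollected garbage.

```julia
# Hypothetical helper: force a full collection and report the GC-tracked
# live heap before and after it.
function gc_and_report()
    before = Base.gc_live_bytes()   # bytes the GC currently considers live
    GC.gc(true)                     # `true` requests a full collection
    after = Base.gc_live_bytes()
    println("forced GC: live heap ", round(before / 2^20, digits = 1),
            " MiB -> ", round(after / 2^20, digits = 1), " MiB")
    return before, after
end

# Hypothetical background loop. `interval` is in seconds and mirrors the
# roughly 100 s cadence mentioned above; set `stop[] = true` to end it.
function start_gc_loop(interval = 100; stop = Threads.Atomic{Bool}(false))
    task = Threads.@spawn begin
        while !stop[]
            sleep(interval)
            gc_and_report()
        end
    end
    return task, stop
end
```

Calling `gc_and_report()` once by hand under load is an easy way to check what a single full collection actually frees.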

I'll also try Base.gc_bytes() to see whether it gives more information.
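A rough sketch of how those counters could be read around a piece of work. `measure_alloc` is a hypothetical helper; `Base.gc_bytes()` and `Base.gc_live_bytes()` are unexported, so treat their behavior as version dependent. As far as I understand, `Base.gc_bytes()` is a cumulative allocation counter, so only the difference between two readings is meaningful.

```julia
# Hypothetical helper: run `f` and report how much was allocated while it ran
# and how the GC-tracked live heap changed.
function measure_alloc(f)
    alloc_before = Base.gc_bytes()        # cumulative bytes allocated so far
    live_before  = Base.gc_live_bytes()   # bytes currently tracked as live
    result = f()
    allocated  = Base.gc_bytes() - alloc_before
    live_delta = Base.gc_live_bytes() - live_before
    return (result = result, allocated = allocated, live_delta = live_delta)
end

stats = measure_alloc(() -> sum(rand(10^6)))
println("allocated ", stats.allocated, " bytes; live heap changed by ",
        stats.live_delta, " bytes")
```

In a server, wrapping a single request handler this way would show whether the allocation rate or the retained heap is what grows under load.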

I will also try sprinkling in more GC.safepoint() calls.
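A sketch of what that could look like, assuming the concern is a long loop that never allocates: in multi-threaded Julia a collection has to wait until every thread reaches a safepoint, and such a loop may not reach one on its own. `busy_sum` and the interval of 1024 iterations are made up for illustration.

```julia
# Hypothetical hot loop that does not allocate. Without an explicit safepoint,
# a thread running it could delay a collection requested by another thread.
function busy_sum(n)
    s = 0.0
    for i in 1:n
        s += sin(i)
        # Every 1024 iterations, give the GC a chance to run if one is pending.
        i % 1024 == 0 && GC.safepoint()
    end
    return s
end

total = busy_sum(10^6)
```

The call is cheap when no collection is pending, but checking only every so many iterations keeps it out of the innermost work.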