A few quick notes here — you may already know these things:
Julia’s GC doesn’t “run in the background”, so simply setting A = nothing won’t trigger a GC run. What happens instead is that the next time you allocate, the GC checks what the “memory pressure” on the system is like and decides whether it needs to look for abandoned objects.
Julia v1.9 should be significantly better at identifying memory pressure in containerized systems.
The GC’s goal isn’t to maintain a small memory footprint for its own sake; it just wants to fit in the system.
Are you having trouble keeping Julia from OOM’ing the system?
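To make the first note concrete, here is a minimal sketch (not a benchmark) showing that dropping a reference frees nothing by itself, while an explicit GC.gc() does reclaim the memory. Base.gc_live_bytes is an unexported internal helper, so treat the exact numbers as indicative only; the array size is arbitrary.

```julia
# Allocate roughly 80 MB of Float64s.
A = rand(10_000_000)
live_before = Base.gc_live_bytes()   # internal helper: bytes the GC currently considers live

# Dropping the only reference does not trigger a collection on its own.
A = nothing

# Force a full collection by hand; normally the GC decides this on a later allocation.
GC.gc()
live_after = Base.gc_live_bytes()

println("live bytes before: ", live_before)
println("live bytes after:  ", live_after)
```

In normal code you rarely need the explicit GC.gc() call; it is used here only to make the effect visible right away.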
To expand on this: measuring RAM usage is pretty complicated (Memory Measurements Complexities and Considerations - Part 1 looks like a good starting point if you want to understand more). On top of that, Julia has no particular reason to give RAM back once it has used it if the OS doesn’t need it, so top will often show high numbers without that ever being a problem.
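As a rough illustration of why the numbers disagree, this sketch prints a few of the figures Julia itself can report. The gib helper is made up for this example, and Base.gc_live_bytes is an unexported internal function; none of these values will necessarily match what top shows.

```julia
# Hypothetical helper for this example: convert bytes to GiB, rounded to 2 digits.
gib(bytes) = round(bytes / 2^30; digits = 2)

println("total system memory:  ", gib(Sys.total_memory()), " GiB")
println("free system memory:   ", gib(Sys.free_memory()), " GiB")
println("peak RSS of process:  ", gib(Sys.maxrss()), " GiB")
println("GC-tracked live heap: ", gib(Base.gc_live_bytes()), " GiB")  # internal helper
```

The peak RSS can sit well above the live heap, because memory that the GC has already reclaimed is not necessarily handed back to the OS.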
It just ends up doing a call to jl_gc_set_max_memory (which you could also just ccall from Julia) so it seems to me that it should indeed be callable during runtime.
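For reference, a ccall along those lines might look like the sketch below. It assumes your Julia build exports jl_gc_set_max_memory with the C signature void jl_gc_set_max_memory(uint64_t); this is an internal function rather than documented API, so it may change between versions. The 4 GiB value is only an example.

```julia
# Example limit chosen for illustration: 4 GiB, as a UInt64 byte count.
max_bytes = UInt64(4) * UInt64(2)^30

# Assumption: the running libjulia exports jl_gc_set_max_memory(uint64_t).
# This is internal, undocumented API, so check it against your Julia version.
ccall(:jl_gc_set_max_memory, Cvoid, (UInt64,), max_bytes)
```

I have not measured how the GC behaves after changing this mid-session, so test it on your own workload before relying on it.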