I watched Christine Flood’s JuliaCon 2022 talk about the Julia GC. I’m glad to see that the GC is finally getting some attention and not just the compiler. It’s long overdue.
Let’s just address the elephant in the room: the Julia GC has a memory fragmentation problem. There, I said it. I brought up this possibility (I wasn’t sure at the time) on this forum a few times over the years, all the way back to Julia 0.2, but it was always brushed aside. The standard answer seemed to be: the Julia GC doesn’t have a problem, it’s you (i.e. you have a memory leak). My code didn’t, but proving that was close to impossible because there was no tooling to find memory leaks in user code.
Anyway, it’s good to see that Christine finally acknowledged that, even without a memory leak in user code, Julia can take more and more host memory until you run out (I’ve experienced this myself). This is of course heap fragmentation, although she never uses that term. Admitting that you have a problem is the start of recovery.
Unfortunately the immediate things planned seem to be about performance (parallel GC, shorter collect time) rather than fixing fragmentation. Due to the current constraint of non-movable heap memory I’m not sure if this problem can ever be fully solved. To fix it completely you’ll probably have to relax that constraint. Hopefully the future GC work will also try to address (or at least mitigate) fragmentation.
What I do now is adapt my code to avoid fragmentation as much as possible. This is also good for performance. Instead of growing arrays with push!/append! and trying to use only as much memory as required (as you would in languages like Java/C#, whose GCs don’t have fragmentation issues), I preallocate a fixed amount of memory (sometimes more than strictly needed) and initialize everything with zeros. That’s an old-school trick, of course, but it solved an out-of-memory problem I had in an application.
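To make the difference concrete, here is a minimal sketch of the two styles. The function names, the `capacity` parameter, and the squares workload are made up for illustration; this is not taken from my application.

```julia
# Incremental growth: the backing storage is reallocated as the vector grows.
function collect_push(n::Int)
    out = Float64[]
    for i in 1:n
        push!(out, i^2)          # may trigger a reallocation
    end
    return out
end

# Preallocation: one zero-initialized buffer of fixed capacity (possibly
# larger than needed), filled in place. `capacity` is a hypothetical upper
# bound chosen by the caller.
function fill_preallocated(capacity::Int, n::Int)
    n <= capacity || throw(ArgumentError("n exceeds the preallocated capacity"))
    buf = zeros(Float64, capacity)
    for i in 1:n
        buf[i] = i^2
    end
    return view(buf, 1:n)        # only the filled part, without copying
end
```

A middle ground, if you only know a rough upper bound, is to call `sizehint!(out, n)` before the loop and keep using `push!`.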
I feel that the documentation for push! and append! (and similar functions) should carry a warning that using them for a large number of elements (tens of thousands) will cause heap fragmentation, slow run time and possibly an out-of-memory error. A warning like “push!/append! considered dangerous” might not be out of place.
Next I’d like to see some tooling that lets you look at the heap structure (like you have for the JVM/CLR) and detect where a lot of allocations/deallocations come from (I believe, from one of the presentations, that the profiler now has some info about allocations). Ideally this would be graphical and integrated into the VS Code tooling.
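For what exists today, here is a rough sketch of how allocations can be inspected from the REPL. It assumes Julia 1.8 or later, where the `Profile` standard library has an allocation profiler (`Profile.Allocs`); `grow_vector` is just a made-up workload. It shows allocation sizes and types, not the heap layout, so it doesn’t reveal fragmentation directly.

```julia
using Profile

# Hypothetical workload: grows a vector one element at a time.
function grow_vector(n::Int)
    v = Int[]
    for i in 1:n
        push!(v, i)
    end
    return v
end

grow_vector(10)  # warm up so compilation is not part of the measurement

# Total bytes allocated by one call (macro from Base).
bytes = @allocated grow_vector(100_000)
println("bytes allocated: ", bytes)

# Record allocations with their type and size; sample_rate=1 asks the
# profiler to record every allocation instead of a sample.
Profile.Allocs.clear()
Profile.Allocs.@profile sample_rate=1 grow_vector(100_000)
results = Profile.Allocs.fetch()

# Sum the recorded sizes per allocated type.
function summarize(allocs)
    totals = Dict{Any,Int}()
    for a in allocs
        totals[a.type] = get(totals, a.type, 0) + a.size
    end
    return totals
end

bytes_by_type = summarize(results.allocs)
for (T, total) in bytes_by_type
    println(T, " => ", total, " bytes")
end
```

Each recorded allocation also carries a `stacktrace` field, which is what you would use to find out where in the code the allocations come from.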
But of course the best solution would be a better GC that doesn’t take more memory than needed. A region-based GC would be sweet!
There is now a new command-line argument to julia for a memory limit. That’s a step in the right direction, but it’s not a hard limit, so it’s still possible to run out of host memory and crash the host itself. That’s bad for servers! I’d like to see another option that sets a hard heap size limit, just like for the JVM: if the Julia heap would need more than the specified limit, julia exits with an OutOfMemory exception. That way only julia exits cleanly and your host server never crashes. I’m sure sysadmins will appreciate that.
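Until something like that exists, the closest thing I can think of in user code is a cooperative check. The sketch below is only an approximation of the behavior I’m asking for, not a real hard limit: `check_memory_limit` and its parameters are names I made up, it only works if your code calls it regularly, and `Sys.maxrss()` reports the peak resident set size of the process rather than the current heap size.

```julia
# Throw an OutOfMemoryError once the process has used more than `limit_bytes`.
# `used_bytes` defaults to the peak resident set size and can be overridden,
# which also makes the function easy to test.
function check_memory_limit(limit_bytes::Integer; used_bytes::Integer = Sys.maxrss())
    if used_bytes > limit_bytes
        throw(OutOfMemoryError())
    end
    return nothing
end

# Hypothetical usage inside a long-running loop:
#
#     const LIMIT = 4 * 1024^3        # 4 GiB, an assumed budget
#     for job in jobs
#         check_memory_limit(LIMIT)   # fail early instead of taking down the host
#         process(job)
#     end
```

Because the check is polled, a single large allocation between two checks can still blow past the limit, which is exactly why I’d prefer this to be enforced by the runtime itself.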