Resizing arrays, when elements have primitive types or more generally isbits types, does not trigger any GC, as this is handled by the C runtime. Dictionaries are also allocation-free during most operations, except during rehashing, which happens if you insert more elements than the current capacity. rehash is called to allocate larger arrays inside the Dict object before data from the original arrays are copied over. The original arrays are then discarded and eventually garbage-collected.
It seems that with a little bit of extra work, Dict for primitive and isbits keys/values can completely avoid GC-tracked allocations. You just need to make a C call to malloc in the Dict constructor and call free in the rehashing function and the finalizer.
Would such changes be worthwhile, given that some Julia users want to write allocation-free code? Tooling for this purpose includes e.g. AllocCheck.jl.
Point taken. I have a codebase where I’ve eliminated almost all allocations, except for Dict (rehashing) allocations which are essentially unsolvable without radical change in algorithms. It shouldn’t be hard to create a fork of Base Dict, where the internal Memory objects (fixed-size arrays) holding the dictionary data are replaced by MallocArray from StaticTools.jl, since the allocation patterns in dict.jl are very simple.
Why would you do that? You don’t save on allocations, your only gain is that you can immediately free the memory on rehash / dict growth, and don’t have to wait for GC.
The main thing I could imagine is that you want to disable GC because you cannot afford latency spikes?
Due to the exponential growth of Dict sizings, you can just disable GC and leak that memory. The total overhead is at most a factor 2 on dict sizes (geometric series – amount of leaked memory is less than amount of used memory).