I’ve written a program in Julia that does heavy insertion and deletion of elements across many separate Dict containers. Memory usage grows large, and I believe I have tracked the cause down to the fact that a Julia Dict cannot be shrunk, either by deleting keys or by calling sizehint!(), so every Dict in my program keeps holding the peak amount of RAM it needed at any point during the run. Are there plans to implement shrinking for Dict in the near future? If not, I’ll try to restructure my program.
P.S. Are there alternative data structures (from third-party packages) that implement the AbstractDict interface while allowing shrinking?
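For anyone who wants to reproduce the observation, here is a minimal sketch that fills a Dict, deletes all but one key, and compares the footprint reported by Base.summarysize. The element count and variable names are arbitrary choices for illustration, and the numbers printed will depend on the Julia version.

```julia
# Fill a Dict, then delete almost everything and compare its footprint.
d = Dict{Int,Int}()
for i in 1:100_000
    d[i] = i
end
size_full = Base.summarysize(d)          # bytes reachable from d while full

for i in 2:100_000
    delete!(d, i)
end
size_after_delete = Base.summarysize(d)  # bytes after deleting all but one key

println("entries left:  ", length(d))
println("bytes (full):  ", size_full)
println("bytes (after): ", size_after_delete)
```

Base.summarysize only counts memory reachable from the object, so it shows what the Dict still holds on to, not what the process has returned to the operating system.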
The docs for Dictionaries.jl don’t mention shrinking. I tried inserting ~10^8 key-value pairs and then deleting all but one key. The memory usage didn’t decrease.
Right, but unlike the standard Dict, removing entries from a Dictionary actually removes them from its underlying data structure. You can manually run sizehint! on its internal fields to release memory. Maybe a method sizehint!(::Dictionary, ::Int) would make this less manual?
A cursory glance through rehash! makes me think it should already work, as it’s literally allocating new buffers and moving elements into them - is there a good way to confirm this? Maybe I’m missing something…
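One rough way to check this empirically, as a sketch rather than a proof: measure the Dict with Base.summarysize before and after calling sizehint! with the current length. Whether the second number drops is exactly the open question and may differ between Julia versions, so no expected output is given here.

```julia
# Build a large Dict, delete all but one key, then ask for a smaller size.
d = Dict(i => i for i in 1:100_000)
for i in 2:100_000
    delete!(d, i)
end

before = Base.summarysize(d)   # footprint after deletions only
sizehint!(d, length(d))        # request capacity for the current contents
after = Base.summarysize(d)    # footprint after the sizehint! call

println("before sizehint!: ", before, " bytes")
println("after  sizehint!: ", after, " bytes")
println("shrunk:           ", after < before)
```

This only tells you whether the buffers were replaced by smaller ones; it says nothing about failure modes in unusual usage patterns.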
Thanks, I had already looked at the source. What I meant was confirming whether the existing rehash! would already work for shrinking the Dict. The relevant parts of the code (in both rehash! and sizehint!) haven’t changed in the last 9 years and the todo comment is still there, so I’d rather err on the side of caution and assume Jeff had a failure mode in mind when writing that code. Exhaustively checking all possible usages of Dict shrinking isn’t feasible, though…
If I recall correctly, we also don’t shrink the allocated memory of an array when it shrinks, so even if a Dict were to shrink its backing arrays, it might not free up any memory until that’s also done. As a workaround, the simplest fix for this at the moment might be just copying the Dict when it’s at its final size.
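A minimal sketch of that workaround follows. It rebuilds the contents into a freshly constructed Dict by inserting the pairs one by one, instead of calling copy, because whether copy allocates smaller buffers is an implementation detail that is worth checking on your Julia version. The helper name rebuild is hypothetical, not a Base function.

```julia
# Hypothetical helper: move the surviving entries into a brand-new Dict
# so that the new buffers are sized for the current contents.
function rebuild(d::Dict{K,V}) where {K,V}
    fresh = Dict{K,V}()
    for (k, v) in d
        fresh[k] = v
    end
    return fresh
end

# Usage: grow a Dict, delete all but one key, then rebuild it.
d = Dict(i => i for i in 1:100_000)
for i in 2:100_000
    delete!(d, i)
end
compact = rebuild(d)

println("old: ", Base.summarysize(d), " bytes")
println("new: ", Base.summarysize(compact), " bytes")
```

The old Dict only becomes collectable once nothing references it anymore, so in a real program you would reassign the variable (d = rebuild(d)) rather than keep both around.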