Is Dict shrinking feature planned?

I’ve written a program in Julia involving heavy insertion and deletion of elements from many separate Dict containers. The memory usage gets bad, and I tracked down the cause (I believe) to the fact that Julia Dicts cannot be shrunk by either deleting keys or calling sizehint!(), so all the Dicts in my program are consuming a maximum amount of RAM needed over the whole duration of the run. Does Julia plan to implement the shrinking feature for Dict in the near future? If not I’ll try to restructure my program.

P.S. Are there alternative data structures (from third-party packages) which implements the AbstractDict interface while allowing shrinking?

1 Like
2 Likes

The docs for Dictionaries.jl don’t mention shrinking. I tried inserting ~10^8 key-value pairs and then deleting all but one key. The memory usage didn’t decrease.

Looks like shrinking is still a todo:

https://github.com/JuliaLang/julia/blob/2c9e051c460dd9700e6814c8e49cc1f119ed8b41/base/dict.jl#L238-L242

Right, but contrary to the standard Dict, removing entries from a Dictionary actually removes entries from its datastructure. You can manually run sizehint! on its internal fields to shrink memory. Maybe a method sizehint!(::Dictionary, ::Int) would help to make things less manual?

A cursory glance through rehash! makes me think it should already work, as it’s literally allocating new buffers and moving elements into them - is there a good way to confirm this? Maybe I’m missing something…

Ah, I just checked the git log - seems that code is still from 2012 and hasn’t really seen all that many changes since then.

If you compile Julia from source, you can do Revise.track(Base), then edit the file and see the changes.

Thanks, I already looked at the source :slight_smile: I was talking about confirming whether or not the existing rehash! would already work for the purpose of shrinking the Dict. The relevant parts of the code (both of rehash! as well as sizehint!) hasn’t changed in the last 9 years and the todo comment is still there, so I’d rather err on the side of caution and say jeff had a failure mode in mind when writing that code. Exhaustively checking all possible usages of dict shrinking isn’t feasible though…

If I recall correctly, we also don’t shrink the allocated memory of an array when it shrinks, so even if a Dict were to shrink it’s backing arrays, it might not free up any memory until that’s also done. As a workaround, the simplest fix for this at the moment might be just copying the Dict when it’s at its final size.

1 Like

sizehint! will shrink the allocated memory (if the decrease is sufficiently large); see sizehint! now supports shrinking arrays #2879 by adrianisuru · Pull Request #26201 · JuliaLang/julia · GitHub (which closed an issue you filed in 2013 :wink:). However, this isn’t done automatically by pop!, empty!, delete!, etcetera, all of which keep the allocated memory fixed.

So it would nowadays make sense for sizehint! of a Dict to also support shrinking the storage.

7 Likes