Can we reuse Dictionary allocation?

CodeGodz · August 10, 2022, 12:54pm

In our application we have to allocate over 20K dictionaries, each having ~1M keys. Each of them is created in a for loop, and since each one is only used within its iteration, garbage collection temporarily freezes the code when memory fills up.

Since we know the maximum number of keys a dictionary can have, we now pre-allocate one dictionary and reuse it in the for loop. To know whether a value was set in the current iteration (and not a previous one), we use Tuple{Int16, Int64} as the value type, where the Int16 is the iteration ID. Simplified, it looks like this:

function reuse_dict()
    # Prealloc dict
    d = Dict{Int64, Tuple{Int16, Int64}}()
    
    # Some numbers to test with
    numbers = rand(1:10_000, 2_000_000)
    sizehint!(d, 2_000_000)
    
    # Track something so compiler knows we
    # do something
    found = 0
    
    # fill it
    for i in 1:100
        # Use a view so the input changes each iteration and the loop is not optimized away
        s = rand(1:100)
        e = rand(1_500_000:2_000_000)
        v = view(numbers, s:e) # just to change each iter
        for (j, numb) in enumerate(v)
            d[numb] = (i, j) # i = origin iter
        end
        
        # Call some f(), just so the work is not optimized away.
        # The default is a tuple so that get always returns the same type.
        if get(d, 1, (Int16(0), 0))[1] == i
            found += 1
        end
        
    end
    return found
end

This requires me to store a Tuple{Int16, Int64}. While that is not much more than a plain Int64, it made me curious: is there some other smart trick to reuse allocated dictionaries?

ericphanson · August 10, 2022, 1:04pm

I think you could just empty! it at the end of each loop iteration. My understanding is that it won’t shrink the internal vectors. In fact, there’s no way to shrink the internal vectors until Julia 1.9, when sizehint! will be able to shrink them, thanks to https://github.com/JuliaLang/julia/pull/45004.
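
A minimal sketch of that suggestion, adapted from the test function in the question. The name reuse_with_empty and the niter argument are made up for illustration, and the claim that the storage is retained rests on the understanding stated above rather than on a measurement:

```julia
# Sketch: reuse one Dict across iterations by clearing it with empty!.
# `reuse_with_empty` and `niter` are hypothetical names, not from the original post.
function reuse_with_empty(niter::Integer = 100)
    d = Dict{Int64, Int64}()          # plain Int64 values, no iteration tag needed
    sizehint!(d, 10_000)              # the test data has at most 10_000 distinct keys
    numbers = rand(1:10_000, 2_000_000)
    found = 0
    for i in 1:niter
        s = rand(1:100)
        e = rand(1_500_000:2_000_000)
        v = view(numbers, s:e)        # different slice each iteration
        for (j, numb) in enumerate(v)
            d[numb] = j
        end
        # Any key present at this point was written in the current iteration.
        if haskey(d, 1)
            found += 1
        end
        empty!(d)                     # remove all entries before the next iteration
    end
    return found
end
```

Because the dictionary is empty at the start of every iteration, a plain haskey check replaces the comparison against the iteration ID.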

CodeGodz · August 10, 2022, 1:26pm

This works. I was curious whether reusing (without empty!) would help with hashing the keys, but that does not seem to be the case (similar execution time and allocation graph). Thanks for this, it is much more elegant.
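
For anyone who wants to check allocations themselves, here is a rough sketch of one way to compare a fresh dictionary per iteration against a reused one. The function names fill_fresh and fill_reused and the data sizes are invented for illustration; this is not the code behind the observation above, and the numbers will vary by Julia version and machine:

```julia
# Sketch: compare bytes allocated when building a fresh Dict per iteration
# versus clearing and reusing a single one. Function names are hypothetical.
function fill_fresh(data, niter)
    total = 0
    for _ in 1:niter
        d = Dict{Int64, Int64}()      # new table every iteration
        for (j, k) in enumerate(data)
            d[k] = j
        end
        total += length(d)            # use the result so the work is kept
    end
    return total
end

function fill_reused(data, niter)
    d = Dict{Int64, Int64}()          # one table shared by all iterations
    total = 0
    for _ in 1:niter
        for (j, k) in enumerate(data)
            d[k] = j
        end
        total += length(d)
        empty!(d)                     # clear entries before the next pass
    end
    return total
end

data = rand(1:10_000, 200_000)
fill_fresh(data, 1); fill_reused(data, 1)   # warm up so compilation is not measured
fresh_bytes  = @allocated fill_fresh(data, 20)
reused_bytes = @allocated fill_reused(data, 20)
println("fresh: ", fresh_bytes, " bytes, reused: ", reused_bytes, " bytes")
```

@allocated only reports bytes allocated, so timing would need a separate measurement, for example with @time or a benchmarking package.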
