I found that accessing an IdDict can lead to memory allocations, while accessing a Dict does not. I first encountered this behaviour in combination with the Memoization.jl package, therefore the original thread can be found here: https://github.com/marius311/Memoization.jl/issues/19
However, as marius311 showed with an MWE, the allocations do not result from Memoization.jl, but rather from the get! method of the IdDict. Please find the MWE below:
const cache_dict = Dict{Tuple{Vector{Int},Int},Int}()
const cache_iddict = IdDict{Tuple{Vector{Int},Int},Int}()
# this is basically what @memoize expands to:
function func_dict(ctr, a)
_getter() = a + 1
get!(_getter, cache_dict, (ctr, a))
end
function func_iddict(ctr, a)
_getter() = a + 1
get!(_getter, cache_iddict, (ctr, a))
end
function main()
ctr = Int[0]
for i = 1:10
println("Run $i")
println("Using Dict: $(@allocated func_dict(ctr, 1))")
println("Using IdDict: $(@allocated func_iddict(ctr, 1))")
end
end
main()
When you run this MWE, you’ll find that after the initial compilation run, the IdDict variant allocates 64 bytes for i = 1 and 32 bytes for each subsequent loop run. The Dict variant on the other hand allocates no memory.
My guess is that it is the creation of the (ctr, a) tuple (on the heap) that doesn’t get elided because you need to actually form the object to compute its objectid.
Edit: This might not be true, cf the simpler example:
const cache_dict = Dict{Tuple{Vector{Int},Int},Int}()
const cache_iddict = IdDict{Tuple{Vector{Int},Int},Int}()
function func_dict(d)
_getter() = 1
get!(_getter, cache_dict, d)
end
function func_iddict(d)
_getter() = 1
get!(_getter, cache_iddict, d)
end
function main()
ctr = Int[0]
d = (ctr, 1)
for i = 1:10
println("Run $i")
println("Using Dict: $(@allocated func_dict(d))")
println("Using IdDict: $(@allocated func_iddict(d))")
end
end
main()
Please ignore the last post, it was stupid. Nevertheless, I still think that the implementation of get! is causing those allocations. Could the creation of val by the ccall on line 176 cause the allocations, assuming that the dict returns a tuple?
Edit: @kristoffer.carlsson is probably right with his assumption that the objectid calculation is what is actually causing the allocations. Consider an example where an integer is used as key:
const cache_dict = Dict{Int,Int}()
const cache_iddict = IdDict{Int,Int}()
function func_dict(d)
_getter() = 1
get!(_getter, cache_dict, d)
end
function func_iddict(d)
_getter() = 1
get!(_getter, cache_iddict, d)
end
function main()
d = 1
for i = 1:10
println("Run $i")
println("Using Dict: $(@allocated func_dict(d))")
println("Using IdDict: $(@allocated func_iddict(d))")
end
end
main()
as well as the calculation of the objectid of a tuple:
After some further testing, I found that @kristoffer.carlsson is definitely right about the issue. Consider the following MWE:
a = Int[0, 2, 3, 4]
b = Float64[1.0, 2.0, 45.0]
@allocated objectid((a,b))
@allocated objectid((a,b))
The first run results in 160 allocations, all subsequent ones in 32 allocations. I guess this is unavoidable due to the tuple formation. Therefore, I marked the answer of @kristoffer.carlsson as solution. Thanks to all for the discussion!