When to use sizehint! for Dict?

milesf · September 16, 2019, 11:17pm

Benchmarking shows that using sizehint! to match the expected number of Dict elements is best in almost all cases, except when the number of elements is at or near 10^5. Both 10^4 and 10^6 prefer sizehint!, but 10^5 strangely does not. This is a reproducible result on my system.

using BenchmarkTools

function bench_hint(n)
    @belapsed for i in 1:$n d[i] = i end setup = (
        d = Dict{Int, Int}();
        sizehint!(d, $n)
    )
end

function bench_no_hint(n)
    @belapsed for i in 1:$n d[i] = i end setup = (
        d = Dict{Int, Int}()
    )
end

#------- Interactive below ------------

# Run all benchmarks
julia> times = [(n, bench_no_hint(n), bench_hint(n)) for i in 1:8 for n = 10^i]
8-element Array{Tuple{Int64,Float64,Float64},1}:
 (10, 7.12299794661191e-8, 7.120739219712526e-8)     
 (100, 7.839081632653062e-7, 7.760636363636364e-7)   
 (1000, 1.2393333333333334e-5, 1.1199333333333335e-5)
 (10000, 0.000221074, 0.000163545)                   
 (100000, 0.002791825, 0.003224295)  # <- Unexpected
 (1000000, 0.088250659, 0.066421847)                 
 (10000000, 1.313739977, 0.800849544)                
 (100000000, 17.57806693, 15.449179262)              

# Show performance ratio. hinting best for everything except 10^5
julia> [(t[3] / t[2], t[1]) for t in times]
8-element Array{Tuple{Float64,Int64},1}:
 (0.9996828965954626, 10)       
 (0.9899930536901236, 100)      
 (0.9036578805809575, 1000)     
 (0.7397749169961189, 10000)    
 (1.1549058411612476, 100000)     # <- Unexpected
 (0.7526498697307179, 1000000)  
 (0.6095951695317863, 10000000) 
 (0.8788895459052619, 100000000)

# Reproduce the above result
julia> times = [(n, bench_no_hint(n), bench_hint(n)) for i in 4:6 for n = 10^i]
3-element Array{Tuple{Int64,Float64,Float64},1}:
 (10000, 0.000221071, 0.000163615)  
 (100000, 0.00277381, 0.003233641)    # <- Unexpected
 (1000000, 0.063075762, 0.055881177)

julia> [(t[3] / t[2], t[1]) for t in times]
3-element Array{Tuple{Float64,Int64},1}:
 (0.7401015963197344, 10000)  
 (1.165775954373227, 100000)    # <- Unexpected
 (0.8859374065112365, 1000000)

Topic		Replies	Views
Sizehint! a Dict with a BigInt size not possible? General Usage question	8	569	October 27, 2020
Do comprehensions implicitly use sizehint? General Usage	3	768	June 17, 2020
Proper way to make an empty sizehinted Dict General Usage question	1	416	December 1, 2020
Improving my mental model: Why is `sizehint!()` so much slower than `zeros()`? Performance	9	1549	March 31, 2022
Is Dict shrinking feature planned? Internals & Design dictionary , memory	10	1223	September 24, 2021

When to use sizehint! for Dict?

Related topics