Benchmarking shows that using sizehint!
to match the expected number of Dict elements is best in almost all cases, except when the number of elements is at or near 10^5. Both 10^4 and 10^6 prefer sizehint!
, but 10^5 strangely does not. This is a reproducible result on my system.
using BenchmarkTools
function bench_hint(n)
@belapsed for i in 1:$n d[i] = i end setup = (
d = Dict{Int, Int}();
sizehint!(d, $n)
)
end
function bench_no_hint(n)
@belapsed for i in 1:$n d[i] = i end setup = (
d = Dict{Int, Int}()
)
end
#------- Interactive below ------------
# Run all benchmarks
julia> times = [(n, bench_no_hint(n), bench_hint(n)) for i in 1:8 for n = 10^i]
8-element Array{Tuple{Int64,Float64,Float64},1}:
(10, 7.12299794661191e-8, 7.120739219712526e-8)
(100, 7.839081632653062e-7, 7.760636363636364e-7)
(1000, 1.2393333333333334e-5, 1.1199333333333335e-5)
(10000, 0.000221074, 0.000163545)
(100000, 0.002791825, 0.003224295) # <- Unexpected
(1000000, 0.088250659, 0.066421847)
(10000000, 1.313739977, 0.800849544)
(100000000, 17.57806693, 15.449179262)
# Show performance ratio. hinting best for everything except 10^5
julia> [(t[3] / t[2], t[1]) for t in times]
8-element Array{Tuple{Float64,Int64},1}:
(0.9996828965954626, 10)
(0.9899930536901236, 100)
(0.9036578805809575, 1000)
(0.7397749169961189, 10000)
(1.1549058411612476, 100000) # <- Unexpected
(0.7526498697307179, 1000000)
(0.6095951695317863, 10000000)
(0.8788895459052619, 100000000)
# Reproduce the above result
julia> times = [(n, bench_no_hint(n), bench_hint(n)) for i in 4:6 for n = 10^i]
3-element Array{Tuple{Int64,Float64,Float64},1}:
(10000, 0.000221071, 0.000163615)
(100000, 0.00277381, 0.003233641) # <- Unexpected
(1000000, 0.063075762, 0.055881177)
julia> [(t[3] / t[2], t[1]) for t in times]
3-element Array{Tuple{Float64,Int64},1}:
(0.7401015963197344, 10000)
(1.165775954373227, 100000) # <- Unexpected
(0.8859374065112365, 1000000)