I’m trying to do random-access updates to Int
values in a dictionary in the most efficient way possible, but it seems hard to avoid extra hash
calls. Here’s the naive approach:
struct A
x::Int
end
function Base.hash(a::A, h::UInt)
println("Hashing an instance of `A`.")
hash(:A, hash(a.x, h))
end
function increment1!(d, k)
if k in keys(d)
d[k] = d[k] + 1
else
d[k] = 1
end
d
end
d = Dict(A(1) => 1)
julia> increment1!(d, A(1))
Hashing an instance of `A`.
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 1 entry:
A(1) => 2
julia> increment1!(d, A(2))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 2 entries:
A(2) => 1
A(1) => 2
With this approach, we get three calls to hash(::A)
when incrementing the value for a key that already exists, and two calls to hash(::A)
for that does not exist in the dictionary.
The following version is a little smarter and only does two calls to hash(::A)
for both cases:
function increment2!(d, k)
d[k] = get(d, k, 0) + 1
d
end
d = Dict(A(1) => 1)
julia> increment2!(d, A(1))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 1 entry:
A(1) => 2
julia> increment2!(d, A(2))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 2 entries:
A(2) => 1
A(1) => 2
I can get it down to one call to hash(::A)
if I make the values mutable by wrapping them in a Ref
, like this:
function increment3!(d, k)
v = get!(d, k, Ref(0))
v[] = v[] + 1
d
end
d = Dict(A(1) => Ref(1))
julia> increment3!(d, A(1))
Hashing an instance of `A`.
Dict{A, Base.RefValue{Int64}} with 1 entry:
A(1) => RefValue{Int64}(2)
julia> increment3!(d, A(2))
Hashing an instance of `A`.
Dict{A, Base.RefValue{Int64}} with 2 entries:
A(2) => RefValue{Int64}(1)
A(1) => RefValue{Int64}(2)
However, I would prefer not to wrap my Int
s in a Ref
if I don’t have to. Is there a way to use Int
values and get the hash calls down to 1 for both cases (key already exists and key does not already exist)?
(I’m aware that my increment
functions are similar to StatsBase.countmap
. However, this is a question about working with dictionaries.)