Updating values in a dictionary with a minimum of lookups

I’m trying to do random-access updates to Int values in a dictionary in the most efficient way possible, but it seems hard to avoid extra hash calls. Here’s the naive approach:

struct A
    x::Int
end

function Base.hash(a::A, h::UInt)
    println("Hashing an instance of `A`.")
    hash(:A, hash(a.x, h))
end

function increment1!(d, k)
    if k in keys(d)
        d[k] = d[k] + 1
    else
        d[k] = 1
    end
    d
end

d = Dict(A(1) => 1)
julia> increment1!(d, A(1))
Hashing an instance of `A`.
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 1 entry:
  A(1) => 2

julia> increment1!(d, A(2))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 2 entries:
  A(2) => 1
  A(1) => 2

With this approach, we get three calls to hash(::A) when incrementing the value for a key that already exists, and two calls to hash(::A) for that does not exist in the dictionary.

The following version is a little smarter and only does two calls to hash(::A) for both cases:

function increment2!(d, k)
    d[k] = get(d, k, 0) + 1
    d
end

d = Dict(A(1) => 1)
julia> increment2!(d, A(1))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 1 entry:
  A(1) => 2

julia> increment2!(d, A(2))
Hashing an instance of `A`.
Hashing an instance of `A`.
Dict{A, Int64} with 2 entries:
  A(2) => 1
  A(1) => 2

I can get it down to one call to hash(::A) if I make the values mutable by wrapping them in a Ref, like this:

function increment3!(d, k)
    v = get!(d, k, Ref(0))
    v[] = v[] + 1
    d
end

d = Dict(A(1) => Ref(1))
julia> increment3!(d, A(1))
Hashing an instance of `A`.
Dict{A, Base.RefValue{Int64}} with 1 entry:
  A(1) => RefValue{Int64}(2)

julia> increment3!(d, A(2))
Hashing an instance of `A`.
Dict{A, Base.RefValue{Int64}} with 2 entries:
  A(2) => RefValue{Int64}(1)
  A(1) => RefValue{Int64}(2)

However, I would prefer not to wrap my Ints in a Ref if I don’t have to. Is there a way to use Int values and get the hash calls down to 1 for both cases (key already exists and key does not already exist)?

(I’m aware that my increment functions are similar to StatsBase.countmap. However, this is a question about working with dictionaries.)

I like Dictionaries.jl which has gettoken for this.

2 Likes

Someone needs to remake Add modify! function for lookup/update/insert/delete in one go by tkf · Pull Request #33758 · JuliaLang/julia · GitHub and get it merged. I also think we might be able to lower d[k] += 1 to modify!(+, d, 1, k...).

1 Like