Could the usage of `haskey` be unidiomatic/inefficient?

It seems to be a common pattern to me that something like this needs to be done:

if !haskey(dict, key)
  ...
else
  item = dict[key]
  ...
end

Would this lead to two lookups and thus cause extra overhead?

I considered using get/3 but it can only go so far as to provide a default value. In cases where more complicated operations instead of just getting a value is needed, I still need to write

item = get(dict, key, nothing)
if item == nothing
  # do stuffs
else
  # do stuffs with item
end

which still involves an extra comparison operation and doesn’t seem to be sensible code at all.

AFAIK in languages such as Rust there seem to be special constructs that help deal with such cases with a Dict.

What’s the idiomatic way to do so? Or is it just simply not a performance concern due to some sort of optimization, and I shouldn’t worry about using !haskey in this scenario?

See julep/rfc: proposal for unified Dict get/insert/has interface · Issue #12157 · JuliaLang/julia · GitHub

2 Likes

What about

function dostuff(dict, key)
    found = true
    item = get(() -> found = false, dict, key)
    if found
        Some(item)
    else
        nothing
    end
end

d = Dict([1 => 'a'])
dostuff(d, 1)                   # Some('a')
dostuff(d, 2)                   # nothing

Only you can Profile.@profile whether this is a perf problem in your specific code. This does cause two lookups, but the first one will fetch the relevant cache-line. So the second lookup only costs CPU and no mem-traffic nor does it have to wait an eternity for main memory.

If profiling shows that the second lookup is a significant part of your time, then my recommendation is Base.ht_keyindex2! (look up position of key; -pos if not present; may rehash dict to make space for new element) and Base.ht_keyindex (look up position of key, -1 if not present).

The optimizer might be capable of removing the second hash computation. I don’t think that the optimizer will be capable of removing the comparisons (the linear search of all colliding elements).

1 Like

I don’t see what is wrong with:

item = get(dict, key, nothing)
if item === nothing
  # do stuffs
else
  # do stuffs with item
end

Only one lookup and === nothing is fast.

2 Likes

The only issue these days is if nothing is a valid value. Otherwise that is a fine way to express this in Julia 1.0+.

1 Like

You could always just define your own

struct MySentinel end

and use that instead of Nothing. There are not specific optimisations for the nothing singleton that doesn’t apply to others.

1 Like