# Tell if a key is present in a dict

This is a very beginner question. I would like to check if a key exists in a dictionary.

``````julia> 1 in Dict(1=>'a', 2=>'b')
ERROR: AbstractDict collections only contain Pairs;
Either look for e.g. A=>B instead, or use the `keys` or `values`
function if you are looking for a key or value respectively.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] in(::Int64, ::Dict{Int64,Char}) at ./abstractdict.jl:28
[3] top-level scope at REPL[9]:1
``````

This makes sense, a dictionary contains Pairs and not keys.

I can do this:

``````julia> 1 in keys(Dict(1=>'a', 2=>'b'))
true
``````

But I think `keys()` returns an iterator, and the `in` operator iterates over all keys until it finds the key. AFAIK iterators only support sequential iteration. Using a linear search to find a key in a hashmap is crazy.

So how can I tell if a key is present in a Dict?

There is `haskey`, as in:

``````julia> d = Dict(1 => 10)
Dict{Int64,Int64} with 1 entry:
1 => 10

true

false
``````
10 Likes

Use the `haskey` function.

2 Likes

This is actually efficient because `keys` returns a `KeysIterator` which is a type of set and has efficient O(1) containment checking.

13 Likes

I thought @StefanKarpinski comment was interestingâŚ and it looks like the âinâ form is faster than haskeyâŚ

Please tell me what I did wrong in this microbenchmark
edit: and i found my mistake almost as soon as i postedâŚ new code.

``````function f1(d, l)
x = true
for i=1:length(l)
x = x && haskey(d, l[i])
end
x
end

function f2(d, l)
x = true
for i=1:length(l)
x = x && l[i] in keys(d)
end
x
end

function main()
n = 500000
d = Dict{Int, Char}()
for i=1:n
d[rand(1:n)] = Char(rand(Int('a'):Int('z')))
end
# be sure and include some look-ups that will fail.
l = rand(1:n+1000, n)
@time f1(d, l)
@time f2(d, l)
end

``````
``````julia> main()
0.000234 seconds
0.000233 seconds
false

julia> main()
0.000234 seconds
0.000154 seconds
false

julia> main()
0.000233 seconds
0.000117 seconds
false

julia> main()
0.000233 seconds
0.000117 seconds
false

julia> main()
0.000233 seconds
0.000117 seconds
false

``````

That is very good to know. Iâm a newcomer to Julia and multi dispatch, and I donât know how to look for documentation. In a single-dispatch OOP language, I could simply look up the method of a class (and the class would already be given). But with Julia, I find it hard to find documentation for things. For example, the `keys` function has this documentation ( https://docs.julialang.org/en/v1/base/collections/#Base.keys )

For an iterator or collection that has keys and values (e.g. arrays and dictionaries), return an iterator over the keys.

It does not (and probably cannot?) tell what kind of iterator is returned, because it depends on the various methods, and they can be added by external packages. I can only be sure about one thing: it returns an iterator. I already read most the manual and I can recall that iteration was described at âInterfacesâ ( https://docs.julialang.org/en/v1/manual/interfaces/# ) but it also does not define what an iterator is. Of course, there are examples showing iterations over various data types, but that is not a definition.

I can also see that there are multiple methods (by looking at `methods(keys)`) but that doesnât help either, because it is just a list of line numbers in the source code, and they donât have method-specific documentation (or at least I donât know where).

How am I supposed to know, that the iterator of a `Dict` is a `KeysIterator` and that it can be efficiently tested for key containment? What is a `KeysIterator` anyway? I just tried to search for it in the official docs, but nothing was found.

After all, I remain uncertain and I have to ask lots of questions here. Iâm constantly asking for help and that is uncomfortable. Am I using the documentation the wrong way? Is there a better way to use it?

2 Likes

Try to change the order of f1 and f2 in your benchmark

this is partly mitigated by `methodswith` but everyone agrees with your statement to some degree for sure.

Also btw, the first thing mentioned in https://docs.julialang.org/en/v1/base/collections/#Iterable-Collections-1

is

To test for the presence of a key in a dictionary, use `haskey` or `k in keys(dict)` .

at this point I think iterator has less to do with Julia but just a general nomenclature in programming,

compare to this python docs about iterator, which doesnât even have a `for loop` (and its equivalence) example
https://docs.python.org/3/c-api/iterator.html

2 Likes

That is the C API. Iterator protocol is documented here: https://docs.python.org/3/library/stdtypes.html#iterator-types#iterator-types but Iâm being offtopic.

1 Like

Perhaps this older blog post would be of interest to you: https://julialang.org/blog/2018/07/iterators-in-julia-0.7/

Ultimately, it almost doesnât matter that iterating over keys gives you a `KeysIterator`. Highly unlikely that type should actually appear in your code, even if your code makes use of it. The `keys` functions returns an iterator, iterators in Julia adhere to a common interface, and things âJust Workâ if you donât overtype your code.

2 Likes

I agree with your comments here that sometimes the documentation is sparse. Itâs a rapidly evolving project where the main Julia folks are often busy with more âseriousâ work like threading support and compiler worker. I suggest that if you have learned something from this discourse thread, that you may submit a PR to the documentation in the relevant section.

Offtopic: I remember that a long time ago MSDN used to have a âcomments/snippetsâ section at the end of their doc pages where folks would submit snippets of code on the particular topic. I think something like that for Julia would be great. Maybe not particularly relevant to `haskey` or `Dict`s but can be extremely useful for packages like `Query`.

1 Like

Good idea. I did and the run time stayed the same ??!! The second function seems to run faster regardless if itâs f1 or f2.

The part in the manual about the iterator interface is the definition of an iterator â anything that implements that interface is an iterator.

You donât actually need to know that to use `keys(dict)`. The point is to abstract away these implementation details. `?keys` documents the actual usage â it even has a sentence about `Dict`.

To be fair, the documentation can always be improved. But it is frequently useful to look at the âverbsâ (functions) like `keys` in addition to the ânounsâ (types).

It is not easy to adjust from an OOP language to multimethods, especially if you are very experienced with OOP. Just give it time.

1 Like

Use `@btime` from BenchmarkTools.jl for microbenchmarks like this instead of `@time`.

1 Like

Take my example, replace @time with @btime and got a very, very ugly stacktrace out of the attempt.

Thatâs the problem with macros, very hard to figure out whatâs going on.

I read through the manual and saw,

A good rule of thumb is that external variables should be explicitly interpolated into the benchmark expression

so i used

@btime f1(\$d, \$l)

and that fixed the problem, and now f1/f2 give the same run time

julia> main()
116.587 Îźs (0 allocations: 0 bytes)
116.584 Îźs (0 allocations: 0 bytes)
false

3 Likes