How efficient is dictionary lookup in Julia?

I have an application where I have to lookup keys in a dictionary several times, and depending on the particular case it could be that I lookup the same key hundreds of times. How much overhead should I expect?

In Python dictionary lookup complexity is O(1) or so I’ve heard. What about Julia?

Its pretty efficient.

using BenchmarkTools
using Random

mydict = Dict(i=>randstring(3) for i in 1:n);

@btime mydict[rand(1:n)]   # counts the random generation as well
# 117.599 ns (1 allocation: 47 bytes)
@btime mydict[433]
# 20.692 ns (0 allocations: 0 bytes)

Yes, Julia Dictionaries are implemented as Hash Tables just like in Python, so it has O(1) lookup complexity (outside of cases with lots of collisions - when many keys have the same hash, which is usually handled internally as a linked list). I think it’s mostly functional languages that occasionally implements Dicts/Maps as balanced binary trees, which gives O(log n) lookup, but plays very nice with the concept of immutable data structures.

You could write a caching wrapper.

mutable struct CachingLookup{S,T}

CachingLookup(d::Dict) = CachingLookup(first(d)..., d)

function Base.getindex(c::CachingLookup, key)
    c.key == key && return c.last_value
    c.last_value = c.dict[key]

(this is an untested sketch, does not handle corner cases and could be made more elegant)

1 Like

Compared to python, dictionary lookup is very fast.

Compared to specialized implementations, it’s not so fast. Part of it is that hashing and equality are complicated in julia (e.g. 0x01==1 and therefore both must have the same hash). Another part is that default hashing is not very specialized:

julia> struct x
julia> @btime hash($x(1));
  33.641 ns (0 allocations: 0 bytes)

This is ridiculously slow, because it involves a non-inlined runtime call to jl_object_id. At some point in the future it will probably get faster (someone writes an if @generated fallback that treats objects as blobs of memory).


So if I want a fast implementation in theory I could provide my own key type and then implement a fast version of hash for it?


Yes, or consider using