Preferred implementation of a lookup table

Here is some code that compares the performance of several implementations of what is effectively a lookup table.

#!/usr/bin/env julia

import BenchmarkTools

module Enums
    @enum COLORS begin
        RED
        GREEN
        BLUE
        BROWN
    end
    
    global const MY_DICT::Dict{COLORS, Float64} = Dict(RED => 42.0, GREEN => 54.0, BLUE => 65.0, BROWN => 45.0)

    function lookup_global(color::COLORS)::Float64
        return MY_DICT[color]
    end

    function doit_global()::Nothing
        for color::COLORS in instances(COLORS)
            lookup_global(color)
        end

        return nothing
    end
    
    
    function lookup_local(color::COLORS)::Float64
        my_dict::Dict{COLORS, Float64} = Dict(RED => 42.0, GREEN => 54.0, BLUE => 65.0, BROWN => 45.0)
        return my_dict[color]
    end
        
    function doit_local()::Nothing
        for color::COLORS in instances(COLORS)
            lookup_local(color)
        end

        return nothing
    end
end

# abstract types

module Abstract
    abstract type AbstractColors end
    
    struct RED   <: AbstractColors end
    struct GREEN <: AbstractColors end
    struct BLUE  <: AbstractColors end
    struct BROWN <: AbstractColors end
    
    lookup(::RED)   = 42.0
    lookup(::GREEN) = 54.0
    lookup(::BLUE)  = 65.0
    lookup(::BROWN) = 45.0    
    
    global const COLORS_VEC = [ RED(), GREEN(), BLUE(), BROWN()]
    
    function doit_global()::Nothing
        for color::AbstractColors in COLORS_VEC
            lookup(color)
        end

        return nothing
    end    

    function doit_local()::Nothing
        my_vec::Vector{AbstractColors} = [ RED(), GREEN(), BLUE(), BROWN()]
        
        for color::AbstractColors in my_vec
            lookup(color)
        end

        return nothing
    end       
end

println("\n\nEnums.doit_global()")
BenchmarkTools.@btime Enums.doit_global()

println("\n\nEnums.doit_local()")
BenchmarkTools.@btime Enums.doit_local()


println("\n\nAbstract.doit_global()")
BenchmarkTools.@btime Abstract.doit_global()

println("\n\nAbstract.doit_local()")
BenchmarkTools.@btime Abstract.doit_local()

Output:

Enums.doit_global()
  24.299 ns (0 allocations: 0 bytes)


Enums.doit_local()
  533.651 ns (16 allocations: 1.88 KiB)


Abstract.doit_global()
  54.372 ns (0 allocations: 0 bytes)


Abstract.doit_local()
  79.262 ns (1 allocation: 80 bytes)

I know that Julia can infer types. I know that global variables are not such a good idea and why. Nevertheless, Abstract.doit_local() is a good representation of how lookup tables are typically implemented at my company: there is always some container holding instances of subtypes of an abstract type, and dispatch does the lookup. I know that containers with an abstract element type carry a performance penalty, because the compiler cannot specialize on the element type and the elements end up boxed in memory.

An alternative could be to use an enum. It prevents the wild growth of functions and structs. To my surprise, Enums.doit_global() is the fastest, even though a const global variable is used. Developers appreciate Julia for its speed, and if this approach is the fastest, an analysis of why would be interesting - probably some low-level compiler optimization is going on.

Questions:

  • What is the preferred way to implement a lookup table in Julia?
  • What are experts opinions on using dispatch for a lookup table?
  • Why is the Enums.doit_global() fastest?

If it is truly a constant lookup table and you don’t need to change it during the runtime of the program, you could use Base.ImmutableDict for a small number of entries, or Base.PersistentDict for a large number of entries (available since Julia v1.11).
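For what it's worth, here is a minimal sketch of the enum table from the question built on Base.ImmutableDict; the module name ImmutableEnums is just for illustration. Entries are added by wrapping the previous dictionary, since ImmutableDict is an immutable linked list of pairs:

module ImmutableEnums
    @enum COLORS begin
        RED
        GREEN
        BLUE
        BROWN
    end

    # Each constructor call wraps the previous dictionary with one more entry.
    const MY_DICT = Base.ImmutableDict(
        Base.ImmutableDict(
            Base.ImmutableDict(
                Base.ImmutableDict(RED => 42.0),
                GREEN => 54.0),
            BLUE => 65.0),
        BROWN => 45.0)

    lookup(color::COLORS) = MY_DICT[color]
end

On Julia v1.11 and newer, the same table could be stored in a Base.PersistentDict instead, which also keeps the value itself immutable but scales better to many entries.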


It’s no surprise that lookup_local is slow, since it builds a whole new dictionary each time it looks something up. If you instead pass the dictionary as an input argument, like this:

function lookup_local(my_dict::Dict, color::COLORS)
    return my_dict[color]
end

then lookup_local and lookup_global are equally fast.

Globals are not slow if they are const. Remember that functions, structs, and other constants like pi are global variables too. It’s the non-const globals that are problematic.
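As a small sketch of the difference (the names A, B, f_const, and f_nonconst are just for illustration): a non-const global can be rebound to a value of any type at any time, so the compiler cannot infer the type of an access and falls back to dynamic lookup, whereas a const global is treated like any other compile-time constant:

julia> const A = 42.0;          # const binding: type and value are known to the compiler

julia> B = 42.0;                # non-const global: may be rebound to anything later

julia> f_const() = A + 1.0;     # infers Float64

julia> f_nonconst() = B + 1.0;  # infers Any, because the type of B may change

julia> @code_warntype f_nonconst()  # shows the Any-typed access to the global B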

As for all your type annotations, you should be a bit careful. First of all, left-hand-side annotations are not really idiomatic, and at worst they can lead to slow-downs (or even bugs), because you may force a conversion into something less performant. See e.g.

julia> using BenchmarkTools

julia> t1 = rand(128);

julia> @btime sum($t1);
  30.746 ns (0 allocations: 0 bytes)

julia> t2::Vector{AbstractFloat} = rand(128);

julia> @btime sum($t2);
  3.362 μs (127 allocations: 1.98 KiB)

For example,

my_dict::Dict{COLORS, Float64} = Dict(RED => 42.0, GREEN => 54.0, BLUE => 65.0, BROWN => 45.0)

is redundant, and looks odd to me. The correct type will be inferred from the right-hand side, so the annotation is superfluous. If you really want to enforce the type of the dictionary, it is more idiomatic to write

my_dict = Dict{COLORS, Float64}(RED => 42.0, GREEN => 54.0, BLUE => 65.0, BROWN => 45.0)

Or alternatively you could use an array; then there is no need to calculate the hash value of the key for each access:

module ArrayLookup
    @enum COLORS begin
        RED = 1
        GREEN = 2
        BLUE = 3
        BROWN = 4
    end

    # The enum values start at 1, so Int(color) indexes directly into the array.
    const MY_ARRAY = [42.0, 54.0, 65.0, 45.0]

    function lookup(color::COLORS)::Float64
        return MY_ARRAY[Int(color)]
    end

    function doit()::Nothing
        for color::COLORS in instances(COLORS)
            lookup(color)
        end

        return nothing
    end
end

Result:

Enums.doit_global() (using a Base.ImmutableDict)
  7.800 ns (0 allocations: 0 bytes)


ArrayLookup.doit()
  2.900 ns (0 allocations: 0 bytes)

So… there is nobody in favor of using dispatch, as shown in the snippet in the original question? For me, the downsides include

  • Wild growth of functions
  • The functions forming the lookup table may become scattered over the package
  • Performance is not the best

Global variables are frowned upon because any code can change them, and even mutating the contents of a const container is supported. Does PersistentDict solve this somehow?

Both ImmutableDict and PersistentDict are constant values (whereas const x = ... only makes the binding constant, not the value it refers to).
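A small sketch of the distinction (the names D and IMM are just for illustration):

julia> const D = Dict(:red => 42.0);    # constant binding, but the Dict itself is mutable

julia> D[:red] = 0.0;                   # allowed: only the binding is constant

julia> const IMM = Base.ImmutableDict(:red => 42.0);  # constant binding to an immutable value

julia> IMM[:red] = 0.0                  # throws a MethodError: setindex! is not defined for ImmutableDict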

Lookup through dispatch can be problematic, since it may involve a dynamic dispatch if the lookup key is not a constant at the call site.
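A minimal sketch, reusing the Abstract module from the question (the function names are just for illustration): when the concrete type of the key is known at the call site the method is resolved at compile time, but when the key comes out of a Vector{AbstractColors} the method is in general selected at run time.

# Concrete type known at the call site: lookup(::RED) is resolved statically.
static_lookup() = Abstract.lookup(Abstract.RED())

# Element type is only known to be AbstractColors: each lookup(color) call
# is in general a dynamic dispatch.
dynamic_lookup(v::Vector{Abstract.AbstractColors}) = [Abstract.lookup(color) for color in v]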

Does this compare to the use of static inside a C/C++ method or function? Such static variables are initialized upon first entry into the function and persist across subsequent calls. They are not const in the C/C++ sense, though.