I was reading through the (extremely helpful!) “performance tips” part of the manual. In the “dangers of abusing multiple dispatch…” section there is a link to “run-time benchmarks comparing (1) type dispatch, (2) dictionary lookup, and (3) a “switch” statement.” The discussion at that link is helpful, and I particularly like that the link goes straight to a comment that starts “Your best bet is always to benchmark.”
On the other hand, it is probably (?) undesirable that the manual points to (1) a mailing list that isn’t used anymore and (2) a benchmark that involves pre-v1.0 code (e.g., immutable Container1{T} val::T end). I thought that the simplest possible change to the documentation would be to change the relevant sentence from “… can be found on the mailing list” to something like “… can be found on Discourse” (along with updating the relevant links).
With that in mind and in case it’s helpful: below I directly rewrite Tim Holy’s example with current language syntax, and then show the results.
The context here is working through an array of parametric containers that hold different types, performing some operation on each element, and combining the results. The struct and function definitions needed for each approach:
The type-dispatch solution
struct Container1{T}
    val::T
end

# one inc method per payload type; dispatch picks the right one at run time
inc(::Int) = 1
inc(::Float64) = 2
inc(::UInt8) = 3

function evaluate_dispatch(vec)
    s = 0
    for item in vec
        s += inc(item.val)
    end
    s
end
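As an aside that is not part of the original example: a quick way to see why this version allocates is @code_warntype. The vector’s element type is the abstract Container1, so item.val is inferred as Any and each inc call becomes a dynamic dispatch with an Any return type, which is where the allocation in the benchmark below comes from.

using InteractiveUtils  # for @code_warntype (loaded automatically in the REPL)

vec1 = [Container1(1), Container1(1.0), Container1(0x01)]  # eltype is Container1, not a concrete type
@code_warntype evaluate_dispatch(vec1)  # inc(item.val) and s are inferred as Any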
The dictionary solution
struct Container2
    code::Symbol
end

function evaluate_dictionary(vec, dict)
    s = 0
    for item in vec
        s += dict[item.code]
    end
    s
end
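A quick usage sketch (mine, not from the original post; the values just mirror the benchmark setup below). One detail that matters for the timings: Dict(:Int => 1, …) is inferred as Dict{Symbol, Int}, so dict[item.code] is type stable and the loop stays allocation-free.

vec2 = [Container2(:Int), Container2(:Float64), Container2(:UInt8)]
dict = Dict(:Int => 1, :Float64 => 2, :UInt8 => 3)  # a concretely typed Dict{Symbol, Int}
evaluate_dictionary(vec2, dict)  # 1 + 2 + 3 == 6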
The switch solution
function evaluate_switch(vec)
    s = 0
    for item in vec
        if item.code == :Int
            s += 1
        elseif item.code == :Float64
            s += 2
        elseif item.code == :UInt8
            s += 3
        else
            error("Unrecognized code")
        end
    end
    s
end
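And a small usage check for the switch version (again mine, not from the original post); any code Symbol other than the three handled above hits the error branch:

vec2 = [Container2(:Int), Container2(:Float64), Container2(:UInt8)]
evaluate_switch(vec2)                 # 6
evaluate_switch([Container2(:Bool)])  # throws ErrorException("Unrecognized code")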
We’ll profile each of these functions using BenchmarkTools:
Benchmarking code
using BenchmarkTools

function compare_approaches()
    vec1 = [Container1(1), Container1(1.0), Container1(0x01)]
    vec2 = [Container2(:Int), Container2(:Float64), Container2(:UInt8)]
    dict = Dict(:Int => 1, :Float64 => 2, :UInt8 => 3)

    dispatch_benchmark = @benchmark evaluate_dispatch($vec1)
    dictionary_benchmark = @benchmark evaluate_dictionary($vec2, $dict)
    switch_benchmark = @benchmark evaluate_switch($vec2)

    return (
        dispatch = dispatch_benchmark,
        dictionary = dictionary_benchmark,
        switch = switch_benchmark,
    )
end
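compare_approaches() returns a NamedTuple of BenchmarkTools.Trial objects, so the individual results can be inspected or compared afterwards; for example (a sketch of how I looked at them, not required for the numbers below):

results = compare_approaches()
results.dispatch                       # the full Trial for the type-dispatch version
minimum(results.dispatch).time         # best-case time in nanoseconds
judge(minimum(results.switch), minimum(results.dispatch))  # TrialJudgement comparing switch vs dispatch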
The bottom line turns out to be that, for this very simple function, the switch statement is the fastest, the dictionary is intermediate, and the dispatch solution is the slowest. The most important performance difference is that the type-dispatch solution requires an allocation, whereas both the dictionary and switch solutions are allocation-free.
Very roughly (on my laptop, using Julia v1.11.5, all the usual caveats): the switch is about a factor of two faster than the dictionary, and about an order of magnitude faster than the type-dispatch approach. My results from the compare_approaches() function above are here:
Dispatch benchmark
BenchmarkTools.Trial: 10000 samples with 993 evaluations per sample.
Range (min … max): 35.257 ns … 5.888 μs ┊ GC (min … max): 0.00% … 98.72%
Time (median): 43.763 ns ┊ GC (median): 0.00%
Time (mean ± σ): 52.043 ns ± 65.339 ns ┊ GC (mean ± σ): 1.27% ± 1.34%
(histogram: log(frequency) by time, 35.3 ns … 118 ns)
Memory estimate: 16 bytes, allocs estimate: 1.
Dictionary benchmark
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
Range (min … max): 9.163 ns … 286.502 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.642 ns ┊ GC (median): 0.00%
Time (mean ± σ): 12.478 ns ± 5.994 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
(histogram: log(frequency) by time, 9.16 ns … 27.3 ns)
Memory estimate: 0 bytes, allocs estimate: 0.
Switch benchmark
BenchmarkTools.Trial: 10000 samples with 1000 evaluations per sample.
Range (min … max): 4.078 ns … 205.657 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.356 ns ┊ GC (median): 0.00%
Time (mean ± σ): 5.861 ns ± 4.363 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
(histogram: log(frequency) by time, 4.08 ns … 15.9 ns)
Memory estimate: 0 bytes, allocs estimate: 0.
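Finally, the allocation claim can be double-checked directly with BenchmarkTools’ @ballocated (this check is mine, not part of the original example); the dispatch version reports the same 16 bytes per call as the Trial above:

using BenchmarkTools

vec1 = [Container1(1), Container1(1.0), Container1(0x01)]
vec2 = [Container2(:Int), Container2(:Float64), Container2(:UInt8)]
dict = Dict(:Int => 1, :Float64 => 2, :UInt8 => 3)

@ballocated evaluate_dispatch($vec1)          # 16 (bytes) on my machine
@ballocated evaluate_dictionary($vec2, $dict) # 0
@ballocated evaluate_switch($vec2)            # 0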