Why is the NamedTuple slower? When/How would it be faster? Is it still allocated on the stack?

In the following case, the Dict beats the NamedTuple both in creation and in lookup. Why is this the case? Am I missing something?

using BenchmarkTools

tup1 = NamedTuple(k => v for (k, v) in [(:a, 5), (:b, 10)])
tup2 = Dict(k => v for (k, v) in [(:a, 5), (:b, 10)])

@btime tup1 = NamedTuple(k => v for (k, v) in [(:a, 5), (:b, 10)]) # -> 830.962 ns (13 allocations: 864 bytes)
@btime tup2 = Dict(k => v for (k, v) in [(:a, 5), (:b, 10)]) # -> 90.508 ns (6 allocations: 544 bytes)

@btime tup1.b # -> 31.407 ns (0 allocations: 0 bytes)
@btime tup2[:b] # -> 16.699 ns (0 allocations: 0 bytes)

I assume the tuple is still on the stack? When would it be faster?

For tuples the number of elements is part of the type. So if you create a tuple for which the number of elements is not known at compile time (e.g. when you create a tuple from a vector), that will be slow. In the case of a NamedTuple the field names are also part of the type, and apparently they cannot be inferred at compile time here.

The dictionary is fine because it can be inferred that the source vector is of type Vector{Tuple{Symbol, Int64}} and thus that the dictionary is of type Dict{Symbol, Int64}.
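A minimal sketch of the difference (the exact printed type parameters may vary slightly between Julia versions): the element count and field names are baked into a (Named)Tuple's type, so building one from runtime data leaves the result type unknown to the compiler, while the Dict's element type can be inferred from the generator.

```julia
using Test

v = [(:a, 5), (:b, 10)]

# The length is part of a tuple's type:
@assert typeof((5, 10)) == Tuple{Int64, Int64}

# The Dict constructor's return type is inferrable from the generator,
# so @inferred passes:
d = @inferred Dict(k => x for (k, x) in v)
@assert d isa Dict{Symbol, Int64}

# The NamedTuple's field names only exist at runtime here, so its
# concrete type cannot be inferred; the value is still correct, though:
nt = NamedTuple(k => x for (k, x) in v)
@assert nt.b == 10
```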


I kind of expected NamedTuple(k => v for (k, v) in ((:a, 5), (:b, 10))) to be fast (i.e. generating the named tuple from a tuple of tuples), but it isn’t either.

Maybe someone can comment on the relationship between type instability and heap allocations?

This is slow mostly because of the method being hit. If you had instead done

julia> @btime NamedTuple{(:a, :b)}((5, 10))
  1.082 ns (0 allocations: 0 bytes)
(a = 5, b = 10)

you’d get it being “instant”, because in this case there’s literally nothing to do, the whole thing can be done at compile time. This only works in cases where you know the keys at compile time though (if you don’t, then NamedTuples probably aren’t a good fit for your program).
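To illustrate (a sketch, not a prescription): only the field names need to be compile-time constants; the values can still come from runtime data, and construction stays fast and allocation-free.

```julia
# Names are fixed at compile time via the type parameter;
# values are ordinary runtime data:
function make_point(x, y)
    return NamedTuple{(:a, :b)}((x, y))
end

nt = make_point(5, 10)
@assert nt == (a = 5, b = 10)
@assert nt.b == 10
```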

This one mostly just comes down to you measuring untyped global variables. If you benchmark without the globals, you’ll see the named tuple is actually faster to access (when the key being accessed is known at compile time):

julia> @btime tup.b setup=(tup = $tup1)
  1.943 ns (0 allocations: 0 bytes)
10

julia> @btime tup[:b] setup=(tup = $tup2)
  4.098 ns (0 allocations: 0 bytes)
10

You are doing the benchmarking in global scope. If you want realistic values, you either need to move it into local scope (or interpolate the variables with $) or declare tup1 and tup2 as const. Then accessing the NamedTuple is indeed faster.
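For example, a sketch of the const variant (timings will differ by machine, so none are shown here):

```julia
using BenchmarkTools

# const gives the globals a fixed type, so access compiles to fast code:
const ctup = (a = 5, b = 10)
const cdict = Dict(:a => 5, :b => 10)

@btime ctup.b      # field access, key known at compile time
@btime cdict[:b]   # hash lookup at runtime
```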

I am not sure whether it helps, but creation is fast when using the literal syntax (a = 5, b = 10).

Thank you guys very much! Sorry for benchmarking in global scope, I should have realized. I switched the code in my actual use case as in the example and it's a little faster. I look up around 50 times more often than I create, which is surely a big reason. I also notice that the script uses less RAM.
I cannot benchmark exactly and reliably, as I am working with a genetic algorithm. As for predefining the keys, I do not want to do that, as they are supposed to be user-definable (optimization objectives).