Get NamedTuple element with String key. Why is this slow?

Hi there,

I’m attempting to get a value from a NamedTuple using a String key, ideally using x[k] syntax. In the example below, why is x["a"] so slow compared to x[Symbol("a")]?

using BenchmarkTools
import Base: getindex

getindex(x::NamedTuple, colname::String) = x[Symbol(colname)]

x = (a=1.1, b=2.2, c=3.3)

@benchmark $x[:a]   # 1.4ns, native to Julia
@benchmark $x[Symbol("a")]  # 1.5ns
@benchmark $x["a"]  # 63ns

I think there is probably a problem with your benchmarking:

julia> using BenchmarkTools

julia> import Base: getindex

julia> getindex(x::NamedTuple, colname::String) = x[Symbol(colname)]
getindex (generic function with 207 methods)

julia> x = (a=1.1, b=2.2, c=3.3)
(a = 1.1, b = 2.2, c = 3.3)

julia> function test(x, s)
           ss = Symbol(s)
           display(@benchmark getindex($x, $ss))
           display(@benchmark getindex($x, Symbol($s)))
           display(@benchmark getindex($x, $s))
       end
test (generic function with 1 method)

julia> test(x, "a")
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  23.098 ns …  1.563 μs  ┊ GC (min … max): 0.00% … 97.10%
 Time  (median):     24.276 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   26.685 ns ± 42.309 ns  ┊ GC (mean ± σ):  4.97% ±  3.10%

  ▂▇██▇▅▂    ▁▁▁             ▂▂▃▂▁                            ▂
  ███████▇▇▇▇████▇▅▆▆▄▄▅▆▄▆▇██████▆▇▆▇▇█▇▅▆▇▇▆▄▅▁▄▄▆▆▆▅▄▄▃▃▄▅ █
  23.1 ns      Histogram: log(frequency) by time      43.4 ns <

 Memory estimate: 48 bytes, allocs estimate: 2.
BenchmarkTools.Trial: 10000 samples with 990 evaluations.
 Range (min … max):  43.611 ns …  1.448 μs  ┊ GC (min … max): 0.00% … 95.98%
 Time  (median):     45.656 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   48.419 ns ± 43.207 ns  ┊ GC (mean ± σ):  2.79% ±  3.04%

  ▁▅▇██▇▆▅▃   ▁▂▁▁         ▁▂▃▃▃▁                             ▂
  ██████████▇██████▇▆▆▅▅▅▅███████▇▇███▇▅▆▆▆▄▃▅▅▅▆▅▄▅▃▁▃▄▅▅▅▅▅ █
  43.6 ns      Histogram: log(frequency) by time      70.4 ns <

 Memory estimate: 48 bytes, allocs estimate: 2.
BenchmarkTools.Trial: 10000 samples with 990 evaluations.
 Range (min … max):  43.836 ns …  1.989 μs  ┊ GC (min … max): 0.00% … 96.95%
 Time  (median):     45.779 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   48.997 ns ± 45.093 ns  ┊ GC (mean ± σ):  2.85% ±  3.05%

  ▃██▇▆▃   ▁▁    ▁▁▂▂▁                                        ▂
  ██████▇████▇▆▆████████▇▆▆▆▅▅▄▆▆▄▃▂▃▄▅▄▅▅▆▄▄▂▄▃▂▃▃▄▄▆▅▄▃▅▄▄▄ █
  43.8 ns      Histogram: log(frequency) by time      86.4 ns <

 Memory estimate: 48 bytes, allocs estimate: 2.

For me, the manual conversion and the new getindex method take basically the same time, which is about double the time of using a Symbol directly. Note that my benchmark probably avoids constant propagation, so the very fast results in your benchmark are probably because the compiler saw that the :a was there and executed the computation during compilation.
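
As a minimal sketch of another way to keep constant propagation out of the measurement, the inputs can be wrapped in a Ref and dereferenced inside the benchmarked expression, an idiom described in the BenchmarkTools manual. The helper name getbystring is hypothetical (it is not from this thread) and is used here only so the sketch does not add a method to Base.getindex. No timings are shown because they depend on the machine and Julia version.

```julia
using BenchmarkTools

x = (a=1.1, b=2.2, c=3.3)

# Hypothetical helper for illustration: same body as the getindex method above,
# but defined under its own name instead of extending Base.getindex.
getbystring(nt::NamedTuple, key::AbstractString) = nt[Symbol(key)]

# $(Ref(v))[] interpolates a Ref and dereferences it inside the benchmark,
# so the compiler cannot treat the value as a compile-time constant.
t_symbol = @belapsed getindex($(Ref(x))[], $(Ref(:a))[])
t_string = @belapsed getbystring($(Ref(x))[], $(Ref("a"))[])

# @belapsed returns the minimum time in seconds.
println("Symbol key: ", t_symbol * 1e9, " ns")
println("String key: ", t_string * 1e9, " ns")
```

If the String version is still noticeably slower here, that would be consistent with the Symbol being constructed at run time rather than during compilation.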


Everything involved in these benchmarks, except for the last example, is fully inferrable by the compiler at compile time.

The first example is a literal symbol indexing into a “pasted in” literal named tuple. The compiler sees that only one specific element is needed and that neither the tuple nor the key is used anywhere after that, so it probably just returns the element in question instead.

The second example is the same, except that the compiler constant-propagates the string through the Symbol constructor. This kind of operation (creating symbols from literal string constants) is common, so the compiler again sees the result at compile time.

The third example probably hits the only dynamic path: the compiler only knows that there’s a string, but not its content (which it would need in order to know the symbol), so it has to fall back to the most generic version permitting strings.
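
One way to check this on a given Julia version is to compare the typed code for a literal string key with that for a string only known at run time. This is only a sketch: literal_key and runtime_key are hypothetical wrappers made up for illustration, and what the compiler actually folds may differ between Julia versions, so no output is shown.

```julia
using InteractiveUtils  # provides @code_typed outside the REPL

x = (a=1.1, b=2.2, c=3.3)

# Hypothetical wrappers for illustration only.
literal_key(nt) = nt[Symbol("a")]     # the string literal is visible to the compiler
runtime_key(nt, s) = nt[Symbol(s)]    # only the type of s (String) is known

# @code_typed compiles for the argument types, so in the second call
# the contents of "a" are not available to the compiler.
display(@code_typed literal_key(x))
display(@code_typed runtime_key(x, "a"))
```

If the first listing contains no call that builds a Symbol while the second one does, that matches the explanation above.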

This also explains why in the benchmark by @Henrique_Becker the second version has the same speed as the last version - the string is no longer a compile time constant and so the symbol has to be constructed at runtime.
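
If the key really does arrive as a String at run time, one possible workaround (a sketch, not something proposed in this thread) is to convert it to a Symbol once and reuse the Symbol for every lookup. The function name repeated_lookup is hypothetical, and the sketch assumes the fields are Float64 as in the example tuple.

```julia
x = (a=1.1, b=2.2, c=3.3)

# Hypothetical helper: adds up one field n times, converting the key only once.
function repeated_lookup(nt::NamedTuple, key::AbstractString, n::Integer)
    sym = Symbol(key)      # String -> Symbol conversion happens once, at run time
    total = 0.0
    for _ in 1:n
        total += nt[sym]   # Symbol lookup, no conversion per iteration
    end
    return total
end

repeated_lookup(x, "a", 3)
```

The cost of constructing the Symbol is then paid once per call instead of once per lookup.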


Thanks both, that makes perfect sense.