Hi, I was comparing the performance of NamedTuple and Dict, and the results are very confusing.
In the following code, when the elements of the container are accessed in a loop, NamedTuple is slower than Dict.
using BenchmarkTools
x = (a=1.0, b=1.0)
y = Dict(:a=>1.0, :b=>1.0)
indx = [:a, :b]
function func1(x, indx)
    for i in indx
        x[i]
    end
end
@btime func1($x, $indx) # 17.034 ns (0 allocations: 0 bytes)
@btime func1($y, $indx) # 9.300 ns (0 allocations: 0 bytes)
However, if we don’t use a loop, NamedTuple shows a huge performance gain. Could someone help me understand why this happens?
Thank you! I’m curious why it matters much more for NamedTuple compared to Dict. The performance difference in func1 and func2 is quite minimal for the latter, but huge for NamedTuple.
My guess is that Dict is optimized for the case where the indices are not known at compile time, while NamedTuple is optimized for the case where they are. So you can pick the one that suits your case.
In func2(::NamedTuple) the compiler is probably able to figure out that x[:a] is never used and that the look-up has no side effects, so the function can directly return x[:b]. This makes the difference seem bigger than it actually is.
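For reference, here is a rough guess at what the loop-free version might look like. The definition of func2 is not shown in this thread, so both functions below are assumptions: func2_guess mirrors the description above (the first look-up is unused), and func2_sum is a hypothetical variant where both look-ups feed into the return value, so neither can be dropped.

```julia
# Assumed shape of the loop-free function; the result of x[:a] is never used.
function func2_guess(x)
    x[:a]
    return x[:b]
end

# Hypothetical variant: both look-ups contribute to the returned value.
func2_sum(x) = x[:a] + x[:b]

x = (a = 1.0, b = 2.0)
y = Dict(:a => 1.0, :b => 2.0)

@show func2_guess(x)
@show func2_sum(x)
@show func2_sum(y)
```

Benchmarking something like func2_sum rather than func2_guess should give a fairer picture of the cost of two look-ups.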
When benchmarking, you should avoid situations where the compiler can “optimize away” your code. I’m not sure exactly what happens in this case, since I cannot run your code now, but it’s better to do something like
function func1(x, indx)
    s = 0.0  # or zero(eltype(x)) for a NamedTuple, zero(valtype(x)) for a Dict
    for i in indx
        s += x[i]
    end
    return s  # important, return something observable
end
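A minimal sketch of how this accumulating version could be exercised on both containers, reusing the x, y and indx from the original post. The function is repeated so the snippet runs on its own; the @btime calls are left as comments because they need BenchmarkTools to be loaded.

```julia
# Accumulating func1, repeated here so this snippet is self-contained.
function func1(x, indx)
    s = 0.0
    for i in indx
        s += x[i]
    end
    return s
end

x = (a = 1.0, b = 1.0)
y = Dict(:a => 1.0, :b => 1.0)
indx = [:a, :b]

# Both containers should produce the same observable result.
@show func1(x, indx)
@show func1(y, indx)

# With BenchmarkTools loaded, interpolate the arguments as in the original post:
#   @btime func1($x, $indx)
#   @btime func1($y, $indx)
```

Checking that both calls return the same value first is a cheap way to confirm the benchmark is measuring the work you think it is.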