Looping through NamedTuple is slow

z-wang · July 8, 2024, 2:38am

Hi, I was comparing the performance of NamedTuple and Dict, and the results are confusing.

In the following code, when the elements of the container are accessed in a loop, NamedTuple performs worse than Dict.

using BenchmarkTools

x = (a=1.0, b=1.0)
y = Dict(:a=>1.0, :b=>1.0)
indx = [:a, :b]

function func1(x, indx)
    for i in indx
        x[i]
    end
end

@btime func1($x, $indx) # 17.034 ns (0 allocations: 0 bytes)
@btime func1($y, $indx) # 9.300 ns (0 allocations: 0 bytes)

However, if we don’t use a loop, NamedTuple is much faster. Could someone help me understand why this happens?

function func2(x)
    x[:a]
    x[:b]
end

@btime func2($x) # 2.700 ns (0 allocations: 0 bytes)
@btime func2($y) # 7.800 ns (0 allocations: 0 bytes)

Per · July 8, 2024, 4:16am

The difference is that in func2, the index is known at compile-time, but in func1 it is not.
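A minimal sketch of that distinction, with the loop stripped away. The helper names `get_literal` and `get_runtime` are made up for illustration and do not appear in the original code. In the first, the field name is written out in the function body, so the compiler can see it; in the second, it arrives as an ordinary argument and is only known when the function runs.

```julia
x = (a=1.0, b=1.0)

# Field name is a literal in the source, so it is visible to the compiler
# (hypothetical helper, for illustration only).
get_literal(nt) = nt[:a]

# Field name is an ordinary runtime value, so the field has to be
# looked up by name when the function runs (hypothetical helper).
get_runtime(nt, key::Symbol) = nt[key]

get_literal(x)
get_runtime(x, :a)
```

Both calls return the same value; only how the field is located differs. Comparing `@code_typed get_literal(x)` with `@code_typed get_runtime(x, :a)` in the REPL should make the difference visible.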

z-wang · July 8, 2024, 4:22am

Thank you! I’m curious why it matters so much more for NamedTuple than for Dict. The performance difference between func1 and func2 is minimal for Dict, but huge for NamedTuple.

Per · July 8, 2024, 4:31am

My guess is that Dict is optimized for the case where indices are not known at compile-time, but NamedTuple is optimized for the case where they are. So you can pick the one that suits your case.

Per · July 8, 2024, 4:41am

In func2(::NamedTuple) the compiler is probably able to figure out that x[:a] is never used, and the look-up has no side effects, so the function can directly return x[:b]. This makes the difference seem bigger than it actually is.

DNF · July 8, 2024, 5:57am

When benchmarking, you should avoid situations where the compiler can “optimize away” your code. I’m not sure exactly what happens in this case, since I cannot run your code now, but it’s better to do something like

function func1(x, indx)
    s = 0.0  # or zero(eltype(x)) 
    for i in indx
        s += x[i]
    end
    return s  # important, return something observable
end

Then you force the function to do actual work.

Also

function func2(x)
    return x[:a] + x[:b]
end

z-wang · July 8, 2024, 8:41pm

Thanks! I reran the benchmark but still get the same results.

z-wang · July 8, 2024, 8:47pm

But even if the indexes are explicitly stated in func1, the results still don’t change.

function func1(x)
    s = 0.0 
    for i in (:a, :b)
        s += x[i]
    end
    return s
end

@btime func1($x) # 16.232 ns (0 allocations: 0 bytes)
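For comparison, here is a sketch of two ways to write the same sum without indexing by Symbol inside a loop body. This is not from the thread, and whether it is actually faster is an assumption that would need to be benchmarked. The names `sum_values` and `sum_fields` are hypothetical; `values` and the `NamedTuple{names}(nt)` constructor are from Base.

```julia
x = (a=1.0, b=1.0)

# Sum every field by iterating over the values, not the keys
# (hypothetical helper, not from the thread).
sum_values(nt::NamedTuple) = sum(values(nt))

# Sum a chosen subset of fields. The names travel in the type via Val,
# so they are available to the compiler rather than only at run time
# (hypothetical helper, not from the thread).
function sum_fields(nt::NamedTuple, ::Val{names}) where {names}
    selected = NamedTuple{names}(nt)  # keep only the requested fields
    return sum(values(selected))
end

sum_values(x)
sum_fields(x, Val((:a, :b)))
```

Timing these with `@btime sum_values($x)` and `@btime sum_fields($x, $(Val((:a, :b))))` would show whether the gap to func2 closes on a given Julia version.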
