Sum over tuple slower than sum over array

miguelborrero · January 24, 2025, 12:02pm

Hi there,

I wrote a trivial sum function:

function my_sum(collection)
    s = zero(eltype(collection))
    for i in eachindex(collection)
        s += collection[i]
    end
    return s
end

When I apply it to a tuple of Float64 I get the following:

my_tuple = Tuple(rand(10^7));
@btime my_sum($(my_tuple))
  9.295 ms (0 allocations: 0 bytes)
5.000249938227185e6

which is very similar to when I apply it to an equal length array of the same type:

my_vec = rand(10^7);
@btime my_sum($(my_vec))
  9.296 ms (0 allocations: 0 bytes)
5.000539508945694e6

Actually slightly better with the tuple as expected since the type provides the length explicitly and therefore if anything my intuition is that this would help. However, if I do the same with built-in sum function I get:

@btime sum($(my_tuple))
  9.300 ms (0 allocations: 0 bytes)
5.000249938227185e6

@btime sum($(my_vec))
  1.322 ms (0 allocations: 0 bytes)
5.000539508945564e6

Which is a big improvement when using an array! I know the functions dispatched are different but why this difference? does vectorization depend on using an Array-type object?

Thanks in advance!
Miguel.

stevengj · January 24, 2025, 12:18pm

Tuples of 10^7 elements will be slow. This is not what tuples are for — usually they should have only a handful of elements, and almost never over 100.

Think of a tuple as a small anonymous struct in which you refer to fields by index rather than by name. (A named tuple is even closer to an anonymous struct.) You wouldn’t define a struct with 10^7 fields, I hope?

miguelborrero · January 24, 2025, 12:30pm

I see! I saw tuples as static-size arrays given that (aside from the type name) they have a type label and also a size label and therefore I thought that whenever possible making use of this static property by letting the compiler take it for granted would be weakly better. However, my intuition seems too naive! Thanks a lot for the reply.

stevengj · January 24, 2025, 12:34pm

StaticArrays.jl use tuples under the hood, and you should also use them only for small arrays. Static-size arrays don’t help the compiler when the length is large, and can easily hurt — it’s like you’re trying to compile code that has 10^7 local variables. See, for example:

What's the deal with StaticArrays? - #2 by stevengj

Topic		Replies	Views
Sum of tuples are slow Performance	12	1445	November 17, 2023
Sum with generic implementation is slower than double sum Performance	2	243	May 9, 2023
Why is the NamedTuple slower? When/How would it be faster? Is it still allocated on the stack? Performance	6	206	February 14, 2025
Performance: tuple unpacking General Usage	8	1953	October 5, 2017
Hashing tuples of tuples is seemingly very slow Performance	6	731	December 22, 2020

Sum over tuple slower than sum over array

Related topics