Memory allocations when performing operation on Tuple of structs

Albert_de_montserrat · October 15, 2024, 6:09am

That looks like a valid function to me, but you should check whether the compiler eventually gives up on unrolling the loop and starts allocating. Alternatively, if you want to avoid passing Val(length(a.ops)) explicitly, you could do the following:

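using LinearAlgebra  # brings mul! into scope for the generated body

# Forward to the unrolled method; the tuple length is known from the type of a.ops
LinearAlgebra.mul!(y::AbstractVector, a::MyStructSum, x::AbstractVector) = LinearAlgebra.mul!(y, a, x, Val(length(a.ops)))

@generated function LinearAlgebra.mul!(y::AbstractVector, a::MyStructSum, x::AbstractVector, ::Val{N}) where N
    quote
        # Emit one call per tuple element; call-site @inline needs Julia 1.8 or newer
        Base.@nexprs $N i -> begin
            @inline mul!(y, a.ops[i], x)
        end
        return y
    end
end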

Use whichever version you prefer. Base.@nexprs is in charge of the unrolling. For example, Base.@nexprs 3 i -> mul!(y, a.ops[i], x) will literally generate the following code:

mul!(y, a.ops[1], x)
mul!(y, a.ops[2], x)
mul!(y, a.ops[3], x)

So I don’t expect this version to allocate memory. However, how long do you expect the tuple to be? I’d guess that compilation times will grow noticeably if it’s really long (I don’t have an intuition here, so you should test it).
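If you want to check the allocation behaviour yourself, here is a minimal self-contained sketch. Note that ScaleOp, OpSum, apply! and count_allocations are hypothetical stand-ins I made up for illustration (they are not the types from this thread), and apply! accumulates into y only so that the result is easy to verify. The measurement is wrapped in a function and preceded by a warm-up call, so neither compilation nor global variables should distort the count.

```julia
# Hypothetical operator: adds c .* x into y.
struct ScaleOp
    c::Float64
end

# Hypothetical in-place kernel standing in for the per-operator method.
function apply!(y, op::ScaleOp, x)
    @inbounds for k in eachindex(y, x)
        y[k] += op.c * x[k]
    end
    return y
end

# Hypothetical container that stores its operators in a tuple.
struct OpSum{T<:Tuple}
    ops::T
end

# Forward to the unrolled method; the tuple length comes from the type.
apply!(y, a::OpSum, x) = apply!(y, a, x, Val(length(a.ops)))

@generated function apply!(y, a::OpSum, x, ::Val{N}) where {N}
    quote
        # One call per tuple element, emitted at compile time.
        Base.@nexprs $N i -> apply!(y, a.ops[i], x)
        return y
    end
end

# Measure inside a function, after a warm-up call that triggers compilation.
function count_allocations(a, n)
    x = ones(n)
    y = zeros(n)
    apply!(y, a, x)
    fill!(y, 0.0)
    bytes = @allocated apply!(y, a, x)
    return bytes, y
end

a = OpSum((ScaleOp(1.0), ScaleOp(2.0), ScaleOp(3.0)))
bytes, y = count_allocations(a, 4)
println("allocated bytes: ", bytes)
```

Swapping the three-element tuple for a much longer one should be a cheap way to see how compilation time and allocations scale in your case.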

As for foreach, it looks like it has a hard-coded manual unrolling up to N=31, and then it loops over the remaining elements of the tuple. You can see it here.
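A rough way to see whether that threshold matters for you is to compare a short tuple with one longer than 31 elements. This is only a sketch under my own assumptions: accumulate_scaled! and measure are hypothetical helpers, plain numbers stand in for the structs, and the result may well differ between Julia versions.

```julia
# Hypothetical helper: adds c .* x into y for every coefficient c in the tuple.
function accumulate_scaled!(y, coeffs::Tuple, x)
    foreach(c -> (y .+= c .* x), coeffs)
    return y
end

# Measure inside a function, after a warm-up call that triggers compilation.
function measure(coeffs, n)
    x = ones(n)
    y = zeros(n)
    accumulate_scaled!(y, coeffs, x)
    fill!(y, 0.0)
    bytes = @allocated accumulate_scaled!(y, coeffs, x)
    return bytes, y
end

short_coeffs = (1.0, 2.0, 3.0)               # well below 31 elements
long_coeffs = ntuple(i -> Float64(i), 40)    # longer than 31 elements

bytes_short, y_short = measure(short_coeffs, 4)
bytes_long, y_long = measure(long_coeffs, 4)
println("short tuple: ", bytes_short, " bytes")
println("long tuple:  ", bytes_long, " bytes")
```

If the long tuple reports allocations and the short one does not, the generated-function version is probably the safer choice for your sizes.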
