Fastest way to initialize StaticArray of dynamic size with state

I’m constructing an SVector, but each value depends on the previously calculated value. This is still simple to do with sacollect, but then the captured state gets boxed and the call allocates (see the example below). What is the optimal approach for this situation?

using StaticArrays
testCalc(i, x) = (i+x, x+1)
function test(::Val{K}, x::Int)::SVector{K} where K
    StaticArrays.sacollect(SVector{K,Int},
        ((i, x) = testCalc(K, x); i) for k in K:-1:1
    )
end
@code_warntype test(Val(4), 1) # output includes x@_5::Union{Int64, Core.Box}
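
For reference, a minimal illustration of what is going on (my own sketch, not part of the original question): x is both captured by the generator and reassigned inside it, so Julia stores it in a Core.Box and every access becomes dynamically typed, which is where the allocation comes from. The same pattern shows up even without sacollect:

# Hypothetical minimal reproduction of the boxing pattern above
function boxdemo(x::Int)
    # the generator closes over x and reassigns it, so x gets boxed
    collect((x = x + 1; x) for _ in 1:3)
end
# @code_warntype boxdemo(1)   # also reports x as Union{Int64, Core.Box}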

Something like this works (and is quite explicit):

julia> using Setfield
       function test2(::Val{K},x::Int) where K
           y = zero(SVector{K,Int})
           @set! y[1] = K
           for i in 2:K
               @set! y[i] = y[i-1] + 1
           end
           return y
       end
test2 (generic function with 1 method)

julia> @btime test2(Val(4),1)
  3.524 ns (0 allocations: 0 bytes)
4-element SVector{4, Int64} with indices SOneTo(4):
 4
 5
 6
 7
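
For completeness, the same @set! pattern can be threaded through the original testCalc state function without any closure, so nothing gets boxed (a sketch with a made-up name test2b, not taken verbatim from the thread):

using StaticArrays, Setfield

testCalc(i, x) = (i + x, x + 1)

function test2b(::Val{K}, x::Int) where K
    y = zero(SVector{K,Int})
    for i in 1:K
        # thread the state explicitly instead of capturing it in a generator
        val, x = testCalc(i, x)
        @set! y[i] = val
    end
    return y
end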


In some cases you can use an MVector, perform your operations, and convert to SVector. If the array doesn’t escape, everything is stack allocated and fast:

function test3(::Val{K},x::Int) where K
    y = MVector{K,Int}(undef)
    y[1] = K
    for i in 2:K
        y[i] = y[i-1] + 1
    end
    return SVector(y)
end

Comparing to the code using Setfield:

1.7.2> @btime test2(Val(4),1);
  3.200 ns (0 allocations: 0 bytes)

1.7.2> @btime test3(Val(4),1);
  2.300 ns (0 allocations: 0 bytes)

The difference grows with array size, presumably because each @set! rebuilds the entire SVector while the MVector version writes elements in place:

1.7.2> @btime test2(Val(8),1);
  11.011 ns (0 allocations: 0 bytes)

1.7.2> @btime test3(Val(8),1);
  2.600 ns (0 allocations: 0 bytes)

1.7.2> @btime test2(Val(16),1);
  37.538 ns (0 allocations: 0 bytes)

1.7.2> @btime test3(Val(16),1);
  3.100 ns (0 allocations: 0 bytes)

1.7.2> @btime test2(Val(32),1);
  110.645 ns (0 allocations: 0 bytes)

1.7.2> @btime test3(Val(32),1);
  21.486 ns (0 allocations: 0 bytes)

1.7.2> @btime test2(Val(64),1);
  1.300 μs (0 allocations: 0 bytes)

1.7.2> @btime test3(Val(64),1);
  38.547 ns (0 allocations: 0 bytes)
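
A note on the “if the array doesn’t escape” condition above (a sketch with a hypothetical name test3_escapes, not from the thread): if the MVector itself is returned it escapes the function, and you should typically expect a heap allocation, unlike test3 which converts to an SVector first.

using StaticArrays

function test3_escapes(::Val{K}, x::Int) where K
    y = MVector{K,Int}(undef)
    y[1] = K
    for i in 2:K
        y[i] = y[i-1] + 1
    end
    return y   # the MVector escapes here, so it is typically heap allocated
end
# @btime test3_escapes(Val(4), 1)   # expect an allocation, unlike test3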

It’s great to see that the compiler can optimize away that allocation. This approach seems the best conceptually to me, and it also performs best of the ones presented here. As for cases where it wouldn’t work: it seems you should always be able to extract the loop body so that this becomes a utility/constructor function:

function makeSV(f, ctx, init, ::Val{K}) where K
    y = MVector{K,Int}(undef)
    next = init
    for i in 1:K
        # f(ctx, state, i) returns (value, new_state)
        val, next = f(ctx, next, i)
        y[i] = val
    end
    return SVector(y)
end
# example state function: emits the current state and increments it by 1
function inner(ctx, prev, i)
    (prev, prev + 1)
end

makeSV(inner, nothing, 8, Val(8))
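
One possible refinement (an assumption on my part, not something from the thread): the element type is hard-coded to Int above, but the same structure works with a caller-supplied type:

# Hypothetical variant of makeSV with a caller-supplied element type T
function makeSV(f, ctx, init, ::Val{K}, ::Type{T}) where {K,T}
    y = MVector{K,T}(undef)
    next = init
    for i in 1:K
        val, next = f(ctx, next, i)
        y[i] = val
    end
    return SVector(y)
end

makeSV(inner, nothing, 1.5, Val(4), Float64)   # reuses inner from above: [1.5, 2.5, 3.5, 4.5]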

Interesting. That appears to run exactly as fast as the specific case @DNF proposed, once it is modified to match (unmodified it is slower simply because it does more work):

function testing(::Val{K},x::Int) where K
    y = MVector{K,Int}(undef)
    for i in 1:K
        y[i] = x
        x += 1
    end
    return SVector(y)
end

Thanks for your help! Now I have a very flexible and fast function to create an SVector.
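
Putting the pieces together (a sketch with a hypothetical name test4, not verbatim from the thread): the recurrence from the original question can be fed through makeSV like this:

# Hypothetical wrapper combining the question’s testCalc with makeSV from above
testCalc(i, x) = (i + x, x + 1)
test4(::Val{K}, x::Int) where K = makeSV((ctx, prev, i) -> testCalc(i, prev), nothing, x, Val(K))

test4(Val(4), 1)   # SVector{4, Int64}: [2, 4, 6, 8]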