Constructing SVector with a loop

Suppose I want to construct an SVector with known length S and type T, but a loop (with a known number of iterations) would be the most comfortable way of doing it. What is the standard idiom for this in 1.0?

My problem is a bit complex, but for an example, consider

using StaticArrays

function SFib1(::Val{S}) where S
    values = Vector{Int}(undef, S)
    @inbounds for i in 1:S
        values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    SVector{S}(values)
end

Is it possible to allocate the SVector, then manipulate it with reinterpret or similar before returning it?

It looks like you’re looking for MVector?

I am not sure. Should I construct an MVector, then convert to an SVector? It appears to be slower than using a Vector, e.g.

function SFibM(::Val{S}) where S
    values = MVector{S, Int}(undef)
    @inbounds for i in 1:S
        values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    SVector{S}(values)
end

julia> @btime SFib1(Val(10));
  40.800 ns (1 allocation: 160 bytes)

julia> @btime SFibM(Val(10));
  65.488 ns (0 allocations: 0 bytes)

It seems the crossover point is around 7; at Val(7), the MVector version is the faster one:

julia> @btime SFibM(Val(7))
  31.316 ns (0 allocations: 0 bytes)

julia> @btime SFib1(Val(7))
  48.441 ns (1 allocation: 144 bytes)

Without the additional conversion to SVector (i.e. returning the MVector directly), I get:

julia> @btime SFibM2(Val(7)) 
  10.438 ns (1 allocation: 64 bytes)
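
(SFibM2 isn't defined above; presumably it is just SFibM without the final conversion, i.e. something like the following hypothetical reconstruction.)

function SFibM2(::Val{S}) where S
    values = MVector{S, Int}(undef)
    @inbounds for i in 1:S
        values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    values   # return the MVector directly, skipping the SVector conversion
end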

Probably not a valid solution, but fun anyway:

function SFibNtuple(::Val{S}) where S
    val_i_minus_2 = Ref(1)
    val_i_minus_1 = Ref(1)
    tup = ntuple(S) do i
        if i == 1 || i == 2
            1
        else
            val = val_i_minus_2[] + val_i_minus_1[]
            val_i_minus_2[] = val_i_minus_1[]
            val_i_minus_1[] = val
        end
    end
    SVector(tup)
end

This infers up to Val(10).

using BenchmarkTools
@btime SFibNtuple(Val(10));

prints 1.702 ns (0 allocations: 0 bytes).

You could use a generated function to unroll the loop.
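
For example, a sketch of what such a generated function could look like (SFibGen is just an illustrative name, not from the thread):

using StaticArrays

@generated function SFibGen(::Val{S}) where S
    # Build the function body at generation time: one local per element,
    # then splat the locals into the SVector constructor, so there is no
    # loop left at runtime.
    assignments = Expr[]
    names = Symbol[]
    for i in 1:S
        name = Symbol(:x, i)
        push!(names, name)
        if i <= 2
            push!(assignments, :($name = 1))
        else
            push!(assignments, :($name = $(names[i-1]) + $(names[i-2])))
        end
    end
    quote
        $(assignments...)
        SVector{$S, Int}($(names...))
    end
end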

Setfield.@set is a great tool for manipulating StaticArrays efficiently

using Setfield
using StaticArrays

function SFib1(::Val{S}) where S
    values = zeros(SVector{S, Int})
    @inbounds for i in 1:S
        values = @set values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    values
end
julia> @code_llvm SFib1(Val(4))

; Function SFib1
; Location: REPL[132]:2
define void @julia_SFib1_38655({ [4 x i64] }* noalias nocapture sret) {
L45.3:
; Location: REPL[132]:4
; Function macro expansion; {
; Location: /home/takafumi/.julia/packages/Setfield/X9IQb/src/sugar.jl:77
  %1 = bitcast { [4 x i64] }* %0 to <4 x i64>*
  store <4 x i64> <i64 1, i64 1, i64 2, i64 3>, <4 x i64>* %1, align 8
  ret void
;}
}

Neither a loop nor necessarily a good idea, but here is this:

function raise_(N) 
    if N == 0 
        :(args...)
    elseif N == 1   
        :(f(args...)) 
    else
        expr = raise_(N-1)
        return :(f($expr))
    end
end

@generated function raise(f, ::Val{N}, args...) where N
    raise_(N)
end

using StaticArrays, BenchmarkTools
fib(_) = (1, 1)
fib(t::NTuple) = (t..., t[end] + t[end-1])
fibs(::Val{N}) where N = SVector(raise(fib, Val(N-1), nothing))

The @code_llvm output (presumably for fibs(Val(4)), judging by the [4 x i64]) is:
; Function fibs
; Location: /Users/schauermr/.julia/v0.6/Bridge/project/ntuple.jl:20
define void @julia_fibs_38351({ [4 x i64] }* noalias nocapture sret) {
top:
  %1 = bitcast { [4 x i64] }* %0 to <4 x i64>*
  store <4 x i64> <i64 1, i64 1, i64 2, i64 3>, <4 x i64>* %1, align 8
  ret void
}

Is there a better solution to this, now?

It seems that SVectors (and SMatrices and SArrays), while being fast, stack-based, fixed-size containers (and indeed my go-to containers for short, fixed-size arrays), are complicated to construct when the computation is nontrivial. Unless I'm missing something.

I would use Accessors.jl for it:

julia> using StaticArrays, Accessors

julia> function SFib2(::Val{S}) where S
           values = SVector(ntuple(_ -> 0, Val(S)))
           @inbounds for i in 1:S
               @reset values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
           end
           values
       end

Accessors.jl is the successor to Setfield.jl (used in the already accepted answer).

Otherwise, I would use an MVector and convert to SVector at the end.

For Julia 1.11+, you could try a Memory:

julia> function memfib(::Val{S}) where S
           values = Memory{Int}(undef, S)
           @inbounds for i in 1:S
               values[i] = ((i == 1) | (i == 2)) ? 1 : values[i-1] + values[i-2]
           end
           values
       end

julia> @btime memfib(Val(7))
  4.010 ns (1 allocation: 80 bytes)
7-element Memory{Int64}:
  1
  1
  2
  3
  5
  8
 13

This stays pretty fast too.

In v1.13 (but not v1.11; I haven't checked v1.12), I have found that if I construct or convert to an MArray initially but convert to an SArray by the end of the function (and don't let the MArray escape), then the function runs without heap allocations and quite quickly. This has let me apply functions like LinearAlgebra.lowrankupdate! (which requires mutability) to an SMatrix heap-free and without needing to write a non-mutating version of the function.
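
A minimal sketch of that pattern, assuming a made-up example function (normalize_columns) rather than the lowrankupdate! case mentioned above:

using StaticArrays

function normalize_columns(A::SMatrix{N, M, T}) where {N, M, T}
    B = MMatrix{N, M, T}(A)          # mutable scratch copy; never escapes this function
    @inbounds for j in 1:M
        s = zero(T)
        for i in 1:N
            s += B[i, j]^2
        end
        s = sqrt(s)
        for i in 1:N
            B[i, j] /= s
        end
    end
    return SMatrix{N, M, T}(B)       # only the immutable result escapes
end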

So the MVector approach may not be fantastic now but might be soon™.

The Accessors.jl solution doesn't even have any runtime cost, though; it's all done at compile time here, which is nice. Unfortunately, Memory is still too opaque to the compiler to be eliminated here.

If that’s the goal, couldn’t it simply be done with @assume_effects?

Base.@assume_effects :total function SFibA(::Val{S}) where S
    values = MVector{S, Int}(undef)
    @inbounds for i in 1:S
        values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    SVector{S}(values)
end
The relevant part of the @code_llvm output (presumably for SFibA(Val(4))) is:

;  @ In[13]:32 within `SFibA`
; ┌ @ /usr/local/julia-depot/packages/StaticArrays/LSPcF/src/convert.jl:180 within `StaticArray`
; │┌ @ /usr/local/julia-depot/packages/StaticArraysCore/7xxEJ/src/StaticArraysCore.jl:115 within `SArray`
    store <4 x i64> <i64 1, i64 1, i64 2, i64 3>, ptr %sret_return, align 8
    ret void
; └└

(One can probably reduce :total to something more modest.)
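
For instance, :foldable asserts a subset of what :total does and is aimed at enabling constant folding; whether it is sufficient here hasn't been checked:

Base.@assume_effects :foldable function SFibF(::Val{S}) where S
    values = MVector{S, Int}(undef)
    @inbounds for i in 1:S
        values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
    end
    SVector{S}(values)
end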

On 1.12, no effects are needed:

julia> function SFibA(::Val{S}) where S
           values = MVector{S, Int}(undef)
           @inbounds for i in 1:S
               values[i] = (i == 1 || i == 2) ? 1 : values[i-1] + values[i-2]
           end
           SVector{S}(values)
       end
SFibA (generic function with 1 method)

julia> @code_llvm debuginfo=:none SFibA(Val(4))
; Function Signature: SFibA(Base.Val{4})
define void @julia_SFibA_5700(ptr noalias nocapture noundef nonnull sret([1 x [4 x i64]]) align 8 dereferenceable(32) %sret_return) #0 {
L61.3:
  store <4 x i64> <i64 1, i64 1, i64 2, i64 3>, ptr %sret_return, align 8
  ret void
}

Looks like it works up to size 12:

julia> @code_llvm debuginfo=:none SFibA(Val(12))
; Function Signature: SFibA(Base.Val{12})
define void @julia_SFibA_5928(ptr noalias nocapture noundef nonnull sret([1 x [12 x i64]]) align 8 dereferenceable(96) %sret_return) #0 {
L61.11:
  store <4 x i64> <i64 1, i64 1, i64 2, i64 3>, ptr %sret_return, align 8
  %"new::SArray.sroa.0.sroa.5.0.sret_return.sroa_idx" = getelementptr inbounds i8, ptr %sret_return, i64 32
  store <4 x i64> <i64 5, i64 8, i64 13, i64 21>, ptr %"new::SArray.sroa.0.sroa.5.0.sret_return.sroa_idx", align 8
  %"new::SArray.sroa.0.sroa.9.0.sret_return.sroa_idx" = getelementptr inbounds i8, ptr %sret_return, i64 64
  store <4 x i64> <i64 34, i64 55, i64 89, i64 144>, ptr %"new::SArray.sroa.0.sroa.9.0.sret_return.sroa_idx", align 8
  ret void
}

Well, the Memory here will never be eliminated - it is returned after all, just like the Vector or SVector before it :slight_smile:

Oh, my bad, I didn't read the code very carefully I guess, and just assumed you were still converting to an SVector, since otherwise it wouldn't really fit the original question.

Another compile-time version using only ntuple, no packages:

julia> function fibntuple(::Val{S}) where S
           fib_curr::Int = 0
           fib_next::Int = 1
           return ntuple(Val(S)) do i
               @inline
               fib_curr, fib_next = fib_next, fib_curr + fib_next
               return fib_curr
           end
       end
fibntuple (generic function with 1 method)

julia> Sfib(::Val{S}) where {S} = SVector(fibntuple(Val(S)))
Sfib (generic function with 1 method)

julia> @btime Sfib(Val(32));
  0.584 ns (0 allocations: 0 bytes)

julia> @code_llvm debuginfo=:none Sfib(Val(32));
define void @julia_Sfib_1137([1 x [32 x i64]]* noalias nocapture noundef nonnull sret([1 x [32 x i64]]) align 8 dereferenceable(256) %0) #0 {
top:
  %1 = bitcast [1 x [32 x i64]]* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(256) %1, i8* noundef nonnull align 8 dereferenceable(256) bitcast ([1 x [32 x i64]]* @_j_const1 to i8*), i64 256, i1 false)
  ret void
}

Note that, in contrast to Constructing SVector with a loop - #15 by Oscar_Smith, this doesn't even have the SIMD instructions to populate the vector with hardcoded scalars. It's just directly returning a stored constant. (In other words, there are two levels of "moving the computation to compile time" here: one is computing all the values at compile time but assembling them into a vector at runtime, which I'll call SIMD-style; the other is doing nothing at all at runtime.) This works up to S = 32, beyond which fibntuple falls back to SIMD-style and is no longer inlined into Sfib.

This behavior is quite particular, though. If you try to combine fibntuple and Sfib into a single function, and/or use Refs instead of the typed (but boxed) locals, such as in Constructing SVector with a loop - #5 by tkoolen, you get SIMD-style LLVM again.
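
For illustration, the "combined into a single function" variant referred to above would be something like this (hypothetical name); per the observation above, it compiles to SIMD-style LLVM rather than a stored constant:

function SfibCombined(::Val{S}) where S
    fib_curr::Int = 0
    fib_next::Int = 1
    return SVector(ntuple(Val(S)) do i
        @inline
        fib_curr, fib_next = fib_next, fib_curr + fib_next
        return fib_curr
    end)
end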