Performance assigning and copying with StaticArrays.jl

I was thinking of using the package StaticArrays.jl to improve the performance of my code. However, I only use arrays to store computed values and read them back later, once certain conditions are met. Hence, I benchmarked the type SizedVector against a normal Vector, but I do not understand the results of the code below. I also tried SVector, using Setfield.jl as a workaround for element assignment.

using StaticArrays, BenchmarkTools, Setfield
function copySized(n::Int64)
    v = SizedVector{n, Int64}(zeros(n))
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
function copyStatic(n::Int64)
    v = @SVector zeros(n)
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        @set v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
function copynormal(n::Int64)
    v = zeros(n)
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
n = 10
@btime copySized($n)
@btime copyStatic($n)
@btime copynormal($n)

3.950 μs (42 allocations: 2.08 KiB)
5.417 μs (98 allocations: 4.64 KiB) 
78.822 ns (2 allocations: 288 bytes)

Why does the SizedVector case have so many more allocations, and hence worse performance? Am I not using SizedVector correctly? Shouldn't it perform at least as well as a normal array?

Thank you in advance.

Just a guess, but the loop contents are simple enough that the compiler is probably optimizing the loop away in the latter case. The Vector version is benefitting from that, but the non-builtin types can’t.

As soon as the loop contains some non-trivial code that the compiler can’t just optimize away, the Vector version will start actually looping, which will allow you to measure the actual performance difference.
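
If dead-code elimination is a concern, one common precaution is to return the result, so the loops have an observable effect. A minimal sketch, assuming a hypothetical helper name `copynormal_returning` that is not from the original post (it also uses an integer `v`, unlike the Float64 `zeros(n)` in the question):

```julia
# Hypothetical helper, not from the thread: the same two loops as copynormal,
# but the filled vector is returned so the work is not unused.
function copynormal_returning(n::Int)
    v = zeros(Int, n)              # assumption: Int elements instead of Float64
    w = Vector{Int}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
    return w                       # returning w keeps the loops observable
end
```

With BenchmarkTools loaded it can be timed the same way as the other functions, e.g. `@btime copynormal_returning($n)`.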

Statically sized arrays have their performance advantages when the size is known statically, at compile time. Your arrays are, however, dynamically sized, with the size n being a runtime variable.
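
To illustrate the difference, here is a small sketch. The function names `make_dynamic` and `make_static` are made up for this example. In the first, the length is an ordinary runtime argument, so the concrete array type depends on a value the compiler does not see; in the second, the length is a literal and is part of the type from the start.

```julia
using StaticArrays

# Length is a runtime value: the type parameter `n` is only known when the
# function runs, so the return type cannot be inferred concretely.
make_dynamic(n::Int) = SizedVector{n, Int64}(zeros(Int64, n))

# Length is a literal constant: it is available to the compiler as part of
# the type.
make_static() = SizedVector{10, Int64}(zeros(Int64, 10))
```

Running `@code_warntype make_dynamic(10)` and `@code_warntype make_static()` in the REPL is one way to compare what the compiler can infer in each case.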

You are right! I overlooked this. Thanks!

using StaticArrays, BenchmarkTools, Setfield
function copySized()
    v = SizedVector{10, Float64}(zeros(10))
    w = Vector{Float64}(undef, 10*2)
    for i in eachindex(v)
        v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
function copyStatic()
    v = @SVector zeros(10)
    w = Vector{Int64}(undef, 10*2)
    for i in eachindex(v)
       @set v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
function copynormal()
    v = zeros(10)
    w = Vector{Float64}(undef, 10*2)
    for i in eachindex(v)
        v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
@btime copySized()
@btime copyStatic()
@btime copynormal()

 110.162 ns (3 allocations: 512 bytes)
 48.133 ns (1 allocation: 224 bytes)
 92.045 ns (2 allocations: 368 bytes)

Here’s a version that is statically sized:

function copyStaticStatic(::Val{N}) where {N}
    v = @SVector zeros(N)
    w = Vector{Int64}(undef, N)
    for i in eachindex(v)
        @set v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end

1.7.0> @btime copyStaticStatic(Val(10))
  27.108 ns (1 allocation: 144 bytes)

Edit: Ah, too late :)
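
One more thing worth checking in the SVector versions: as far as I can tell, `@set v[i] = x` from Setfield.jl does not modify `v`. It returns a new SVector, and in the functions above that result is discarded, so `v` stays all zeros and those timings are not measuring the same work as the SizedVector and Vector versions. Rebinding the result (or using `@set!`) keeps the update. A minimal sketch, where `fill_static` is a hypothetical name introduced only for illustration:

```julia
using StaticArrays, Setfield

v = zeros(SVector{3, Int})

@set v[2] = 7            # builds a new SVector; the result is discarded here
@assert v == SVector(0, 0, 0)

v = @set v[2] = 7        # rebinding `v` keeps the update
@assert v == SVector(0, 7, 0)

# Hypothetical variant of the statically sized function: it rebinds `v`
# on every iteration and returns the filled vector.
function fill_static(::Val{N}) where {N}
    v = zeros(SVector{N, Int})
    for i in 1:N
        v = @set v[i] = i
    end
    return v
end
```

Called as `fill_static(Val(10))`, the length is again part of the type, as in `copyStaticStatic`.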