I was thinking of using the package StaticArrays.jl to improve the performance of my code. However, I only use arrays to store computed values and read them back later once certain conditions are met. I therefore benchmarked SizedVector against a normal Vector, but I do not understand the results of the code below. I also tried an SVector, using Setfield.jl as a workaround.
using StaticArrays, BenchmarkTools, Setfield
function copySized(n::Int64)
    v = SizedVector{n, Int64}(zeros(n))
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
function copyStatic(n::Int64)
    v = @SVector zeros(n)
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        # Note: @set returns an updated copy; it is not assigned here, so v stays all zeros.
        @set v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
function copynormal(n::Int64)
    v = zeros(n)
    w = Vector{Int64}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
n = 10
@btime copySized($n)
@btime copyStatic($n)
@btime copynormal($n)
3.950 μs (42 allocations: 2.08 KiB)
5.417 μs (98 allocations: 4.64 KiB)
78.822 ns (2 allocations: 288 bytes)
Why does the SizedVector version have so many more allocations, and hence worse performance? Am I not using SizedVector correctly? Should it not perform at least as well as a normal array?
Just a guess, but the loop contents are simple enough that the compiler is probably optimizing the loop away in the latter case. The Vector version is benefitting from that, but the non-builtin types can’t.
As soon as the loop has some non-trivial code that the compiler can’t just optimize away, the Vector version will start actually looping and then allow you to measure the actual performance difference.
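One way to check that guess, as a rough sketch: each function above ends in a `for` loop and therefore returns `nothing`, so nothing the loops compute is visible to the caller. A hypothetical variant (`copynormal_returning` is a made-up name) that returns `w` makes the result observable:

```julia
# Hypothetical variant of copynormal that returns the filled vector.
function copynormal_returning(n::Int)
    v = zeros(n)                  # Vector{Float64}
    w = Vector{Int}(undef, n)
    for i in eachindex(v)
        v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]               # converts each Float64 back to Int
    end
    return w                      # the caller sees w, so the copy has to happen
end

result = copynormal_returning(3)
```

Benchmarking this version instead of the original should show whether the timings change once the result is actually used.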
Statically sized arrays have their performance advantages when the size is known statically, at compile time. Your arrays are, however, dynamically sized, with the size n being a runtime variable.
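To illustrate the difference, here is a minimal sketch. `make_dynamic` and `make_static` are made-up names for this post, not part of StaticArrays.jl:

```julia
using StaticArrays

# Size comes from a runtime value: the concrete type SVector{n,Int} depends on
# the value of n, so it cannot be determined from the argument type alone.
make_dynamic(n::Int) = SVector{n,Int}(ntuple(identity, n))

# Size comes from a type parameter: N is known when the method is compiled.
make_static(::Val{N}) where {N} = SVector{N,Int}(ntuple(identity, Val(N)))

a = make_dynamic(4)
b = make_static(Val(4))
```

Both calls produce the same vector, but only the second carries the length in the argument type. Calling `make_static(Val(n))` with a runtime `n` would itself be a dynamic dispatch, so this only helps when the size is a literal or is already a type parameter.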
using StaticArrays, BenchmarkTools, Setfield
function copySized()
    v = SizedVector{10, Float64}(zeros(10))
    w = Vector{Float64}(undef, 10*2)
    for i in eachindex(v)
        v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
function copyStatic()
    v = @SVector zeros(10)
    w = Vector{Int64}(undef, 10*2)
    for i in eachindex(v)
        # Note: @set returns an updated copy; it is not assigned here, so v stays all zeros.
        @set v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
function copynormal()
    v = zeros(10)
    w = Vector{Float64}(undef, 10*2)
    for i in eachindex(v)
        v[i] = rand()
    end
    for i in eachindex(v)
        j = i+floor(Int64, 10/4)
        w[j] = v[i]
    end
end
@btime copySized()
@btime copyStatic()
@btime copynormal()
110.162 ns (3 allocations: 512 bytes)
48.133 ns (1 allocation: 224 bytes)
92.045 ns (2 allocations: 368 bytes)
function copyStaticStatic(::Val{N}) where {N}
    v = @SVector zeros(N)
    w = Vector{Int64}(undef, N)
    for i in eachindex(v)
        # Note: @set returns an updated copy; it is not assigned here, so v stays all zeros.
        @set v[i] = i
    end
    for i in eachindex(v)
        w[i] = v[i]
    end
end
1.7.0> @btime copyStaticStatic(Val(10))
27.108 ns (1 allocation: 144 bytes)
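One caveat about the `@set` lines in the functions above: `@set` returns a modified copy, and that copy is never assigned back to `v`, so the `SVector` stays all zeros. A minimal sketch of a version that keeps the updates, using `Base.setindex` (the non-mutating counterpart of `setindex!`, which StaticArrays.jl implements) and rebinding `v` each iteration. `fill_and_copy` is a made-up name, and this is only one possible way to write it:

```julia
using StaticArrays

function fill_and_copy(::Val{N}) where {N}
    v = zeros(SVector{N,Int})
    for i in eachindex(v)
        # setindex (no `!`) returns a new SVector; rebind v to keep the update
        v = Base.setindex(v, i, i)
    end
    w = Vector{Int}(undef, N)
    copyto!(w, v)                 # copy the static vector into the heap-allocated one
    return w
end

w = fill_and_copy(Val(4))
```

This has not been benchmarked here, so the timings above do not apply to it.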