Hello! This is somewhat of a followup question from before. I have some structs using StaticArrays for performance reasons, which has been working great until now. It seems that as long as I am broadcasting ( ‘.=’ ) an array to the field, everything seems fine. But when I want to change the value of ‘s’ (a Float64), I run into an error, which I’ve listed below the example code.
using StaticArrays
struct Particle
pos::MVector{3,Float64}
s::Float64
end
P = Particle(MVector(0., 0., 0.), 1.)
P.pos .= [1.,1.,1.] # runs (even with Array not S.A.)
P.s .= 2. # copyto! syntax error
What would you suggest as the most performant way to proceed? I am thinking of either redefining the struct as mutable, or storing each field in a static array. The first should be easier to implement, but I am worried about performance “gotchas”.
I just played around two different solutions, shown below with benchmarks for future reference.
module S
N=30_000_000
using StaticArrays, Random
mutable struct MParticle
pos::SVector{3,Float64}
s::Float64
end
struct SParticle
pos::MVector{3,Float64}
s::MVector{1,Float64}
end
PM = MParticle(SVector(0., 0., 0.), 1.)
PS = SParticle(MVector(0., 0., 0.), MVector(1.))
# Benchmark
@time for i in 1:N
PM.pos = SVector(rand(),rand(),rand())
PM.s = rand()
end
@time for i in 1:N
PS.pos .= MVector(rand(),rand(),rand())
PS.s .= MVector(rand())
end
# module end
end
julia> include("structTest.jl")
4.830332 seconds (270.00 M allocations: 5.812 GiB, 11.33% gc time, 0.17% compilation time)
4.781887 seconds (270.00 M allocations: 5.812 GiB, 11.59% gc time, 0.04% compilation time)
Main.S
I am surprised that both methods seem equally performant for sufficiently large N, although I noticed SParticle had consistently lower % compilation time.
The first structure definition seems cleaner, so I think I will moved forward in implementing the mutable struct with immutable fields.
Don’t benchmark in global scope. Put the thing you want to benchmark in a function and @benchmark that function (from BenchmarkTools.jl). Don’t use global variables during benchmarking (or for otherwise performance critical objects).
MParticle’s advantage is that it will require 1 allocation to construct, and is 1 contiguous block of memory, while SParticle requires 2 allocations to construct, and has 2 separate addresses.
Neither of these differences will be tested in a micro benchmark updating the same object in place.
function update1(PM, N)
@time for i in 1:N
PM.pos = SVector(rand(),rand(),rand())
PM.s = rand()
end
end
function update2(PS, N)
@time for i in 1:N
PS.pos .= MVector(rand(),rand(),rand())
PS.s .= MVector(rand())
end
end
update1(PM, N)
update2(PS, N)
Do you have further reading on this that you can recommend? I am interested for the sake of curiosity, and this behavior is also important to the performance of my simulation at large.