How to create an SVector of SVectors part II

I have the following code:

using BenchmarkTools, StaticArrays

if ! @isdefined SEGMENTS
    const SEGMENTS  = 6
end

# Type definitions
const SimFloat = Float64
const Vec3     = MVector{3, SimFloat}

function create_state()
    pos =  zeros(SVector{SEGMENTS+1, Vec3})
    pos[1] .= [1.0,2,3]
    vel =  zeros(SVector{SEGMENTS+1, Vec3})
    y = reduce(vcat, vcat(pos, vel))
end

function unpack(y)
    part  = reshape(y,  Size(3, SEGMENTS+1, 2))
    pos1 = part[:,:,1]
    pos2 = [SVector(pos1[:,i]) for i in 1:SEGMENTS+1]
end

state = create_state()
display(unpack(state))

display(@benchmark unpack(y) setup = (y = create_state()))

It has the following output:

julia> include("src/test.jl")
7-element Vector{SVector{3, Float64}}:
 [1.0, 2.0, 3.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
BenchmarkTools.Trial: 
  memory estimate:  432 bytes
  allocs estimate:  2
  --------------
  minimum time:     52.806 ns (0.00% GC)
  median time:      55.680 ns (0.00% GC)
  mean time:        59.781 ns (4.43% GC)
  maximum time:     541.337 ns (77.31% GC)
  --------------
  samples:          10000
  evals/sample:     985

There are two things I want to improve:

  1. the function unpack should return an SVector of SVectors. How can I achieve that?
  2. the function unpack shall NOT allocate

Any ideas?

I am using HybridArrays.jl for similar tasks and it works really well. Maybe that package is interesting for you.

pos2 = SVector{SEGMENTS+1}(SVector(pos1[:,i]) for i in 1:SEGMENTS+1)

2 Likes

This is already nice. It reduces the allocations to one:

julia> @time include("src/test.jl")
7-element SVector{7, SVector{3, Float64}} with indices SOneTo(7):
 [1.0, 2.0, 3.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
BenchmarkTools.Trial: 
  memory estimate:  176 bytes
  allocs estimate:  1
  --------------
  minimum time:     34.508 ns (0.00% GC)
  median time:      37.620 ns (0.00% GC)
  mean time:        39.708 ns (2.43% GC)
  maximum time:     427.749 ns (83.17% GC)
  --------------
  samples:          10000
  evals/sample:     993
  5.087095 seconds (20.65 M allocations: 2.269 GiB, 27.65% gc time)

Remaining question: How to eliminate the last allocation?

Maybe

pos1 = @view part[:,:,1]

That increases the number of allocations from one to eight. I think views do not work with SVectors.

1 Like

Change MVector to SVector and use:

using Setfield
function create_state()
           pos =  zeros(SVector{SEGMENTS+1, Vec3})
           @set! pos[1] = @SVector [1.0,2,3]
           vel =  zeros(SVector{SEGMENTS+1, Vec3})
           y = reduce(vcat, vcat(pos, vel))
       end


julia> function unpack(y)
           part  = reshape(y,  Size(3, SEGMENTS+1, 2))
           pos1 = part[:,:,1]
           pos2 = @SVector [SVector(pos1[:,i]) for i in 1:SEGMENTS+1]
       end
unpack (generic function with 1 method)

julia> @btime unpack(y) setup = (y = create_state())
  4.640 ns (0 allocations: 0 bytes)
7-element SVector{7, SVector{3, Float64}} with indices SOneTo(7):
 [1.0, 2.0, 3.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]



1 Like

I cannot change the type of the input parameter of the unpack function, because this is a callback function from an ODE solver.

But you inspired me to find the following solution:’

function unpack(y)
    part = reshape(SVector{6*(SEGMENTS+1)}(y),  Size(3, SEGMENTS+1, 2))
    pos1 = part[:,:,1]
    SVector{SEGMENTS+1}(SVector(pos1[:,i]) for i in 1:SEGMENTS+1)
end

Output:

julia> include("src/test.jl")
7-element SVector{7, SVector{3, Float64}} with indices SOneTo(7):
 [1.0, 2.0, 3.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
 [0.0, 0.0, 0.0]
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     6.581 ns (0.00% GC)
  median time:      6.699 ns (0.00% GC)
  mean time:        7.011 ns (0.00% GC)
  maximum time:     38.815 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
1 Like