Operate only on a few components of a static vector

Suppose I want to extract and operate on only a few components from an SVector, e.g.

using StaticArrays

f( (x, y, z) ) = (3x, 4y, 5z)

v = SVector( (2i for i in 1:10)...)
indices = (1, 3, 5)
v2 = getindex.(Ref(v), indices)
w = f(v2)

How do I now put the values from w back into the same positions (indices)?
I am looking for a setindex equivalent to the broadcasted getindex.
I can just write a loop, but is there a better way? Using ntuple directly?

This might be a good use case for Setfield.jl

2 Likes

I could be wrong, but I believe the Setfield solution would produce pretty similar machine code to using an MVector instead and just broadcasting setindex!. In both cases you basically just create copies of tuples with apropriate entries swapped out.

2 Likes

Broadcasting doesn’t work since it returns several copies of the vector.

It looks like the solution is just a for loop:

using StaticArrays

f( (x, y, z) ) = (3x, 4y, 5z)

v = SVector( (2i for i in 1:10)...)
indices = (1, 3, 5)

v2 = getindex.(Ref(v), indices)
result = f(v2)

for i in 1:length(indices)
    @inbounds v = setindex(v, result[i], indices[i])
end
1 Like

Is v there still SVector? or did you go with MVector?

SVector

1 Like

If you do MVector, the initial thing is mutated, so eg this works:

f( (x, y, z) ) = (3x, 4y, 5z)
v = MVector( (2i for i in 1:10)...)
indices = (1, 3, 5)
w = f(getindex.(Ref(v), indices))
setindex!.(Ref(v), w, indices)
v # [6, 4, 24, 8, 50, 12, 14, 16, 18, 20]

(alternatively, its the last thing returned by the broadcast if you want to keep in one-line)

1 Like

I have the impression that they are not the same. One is an array of static size that leaves most likely in the heap, while the immutable one will be more agressively optimized. At least my experience with Setfield is much better than with MVectors (even if it just copies the array).

I think on recent versions of Julia (maybe 1.6?), MVector can be stack allocated as well. E.g.:

julia> function foo(x)
           x[1] = 1
           x[2] = 2
           x[3] = 3
           sum(x)
       end
foo (generic function with 2 methods)

julia> x = @MVector[0,0,0];

julia> @allocated foo(x) # after warmup
0

Edit: Actually that may kind of be a non-obvious example since the equivalent Array code wouldn’t allocate either. Here’s maybe a better one:

julia> function bar()
           x = @MVector[1.,2,3]
           x .=  sin.(x)
           sum(x)
       end
bar (generic function with 1 method)

julia> @allocated bar()
0
1 Like

For longer vectors, MVector does seem to be much faster, thanks.

1 Like

Look at this. For longer vectors that changes? Is that the point?

using StaticArrays
v = SVector( (2i for i in 1:10)...)
indices = (1, 3, 5)
f( (x, y, z) ) = (3x, 4y, 5z)
function foo(f, v::StaticArray, indices)
    v2 = getindex.((v,), indices)
    w = f(v2)
    mv = MArray(v)
    # checked by `getindex`
    @inbounds for (i, idx) in enumerate(indices)
        mv[idx] = w[i]
    end
    return SArray(mv)
end

gives

julia> @btime foo(f, $v, $indices);
  8.646 ns (0 allocations: 0 bytes)
3 Likes

Here is a benchmark of various versions for different lengths.

The TL;DR seems to be: Prefer MVectors over SVectors if you need to modify an argument a lot. This goes against what I thought I “knew”. As @marius311 suggested, this may have changed in a recent version of Julia.

It seems that creating SVectors can end up being expensive; I am guessing that this is when actual memory copying is occuring?


using StaticArrays
using BenchmarkTools

"Uses setindex"
function act_on_components1(f, v::StaticArray, indices)
    v2 = getindex.(Ref(v), indices)

    result = f(v2)

    for i in 1:length(indices)
        @inbounds v = setindex(v, result[i], indices[i])
    end

    return v
end

"Converts to MArray and back"
function act_on_components2(f, v::StaticArray, indices)
    v2 = getindex.((v,), indices)
    w = f(v2)
    
    mv = MArray(v)
    
    @inbounds for (i, idx) in enumerate(indices)
        mv[idx] = w[i]
    end

    return SArray(mv)
end


"Modifies in place"
function act_on_components!(f, v, indices)
    v2 = getindex.((v,), indices)
    w = f(v2)

    @inbounds for (i, idx) in enumerate(indices)
        v[idx] = w[i]
    end

    return v

end


function run_benchmark(f, v, indices)

    println()
    @show length(v)

    println("SVector using setindex:")
    v1 = SVector(v...)
    @btime act_on_components1($f, v, $indices) setup=(v = $v1);

    println("SVector using MVector:")
    @btime act_on_components2($f, v, $indices) setup=(v = $v1);

    println("In-place MVector:")
    v2 = MVector(v...)
    @btime act_on_components!($f, v, $indices) setup=(v = $v2);

    println("In-place Vector")
    v3 = [v...]
    @btime act_on_components!($f, v, $indices) setup=(v = $v3);
end

f( (x, y, z) ) = (x + y, y + z, z + x)

run_benchmark(f, 1:10, (1, 3, 7));
run_benchmark(f, 1:20, (1, 3, 7));
run_benchmark(f, 1:50, (5, 20, 49));
run_benchmark(f, 1:100, (5, 20, 99));
run_benchmark(f, 1:200, (5, 20, 199));
run_benchmark(f, 1:500, (5, 20, 499));


using IntervalArithmetic 

println() 
println("With intervals:")
run_benchmark(f, interval.(1:10), (1, 3, 7));
run_benchmark(f, interval.(1:20), (1, 3, 7));
run_benchmark(f, interval.(1:50), (5, 20, 49));
run_benchmark(f, interval.(1:100), (5, 20, 99));
run_benchmark(f, interval.(1:200), (5, 20, 199));
run_benchmark(f, interval.(1:500), (5, 20, 499));

Results:

length(v) = 10
SVector using setindex:
  5.495 ns (0 allocations: 0 bytes)
SVector using MVector:
  6.991 ns (0 allocations: 0 bytes)
In-place MVector:
  2.911 ns (0 allocations: 0 bytes)
In-place Vector
  2.760 ns (0 allocations: 0 bytes)

length(v) = 20
SVector using setindex:
  9.502 ns (0 allocations: 0 bytes)
SVector using MVector:
  8.036 ns (0 allocations: 0 bytes)
In-place MVector:
  2.767 ns (0 allocations: 0 bytes)
In-place Vector
  2.760 ns (0 allocations: 0 bytes)

length(v) = 50
SVector using setindex:
  68.280 ns (0 allocations: 0 bytes)
SVector using MVector:
  10.691 ns (0 allocations: 0 bytes)
In-place MVector:
  3.066 ns (0 allocations: 0 bytes)
In-place Vector
  2.825 ns (0 allocations: 0 bytes)

length(v) = 100
SVector using setindex:
  161.644 ns (0 allocations: 0 bytes)
SVector using MVector:
  28.377 ns (0 allocations: 0 bytes)
In-place MVector:
  2.998 ns (0 allocations: 0 bytes)
In-place Vector
  2.770 ns (0 allocations: 0 bytes)

length(v) = 200
SVector using setindex:
  290.125 ns (0 allocations: 0 bytes)
SVector using MVector:
  58.724 ns (0 allocations: 0 bytes)
In-place MVector:
  2.546 ns (0 allocations: 0 bytes)
In-place Vector
  2.315 ns (0 allocations: 0 bytes)

length(v) = 500
SVector using setindex:
  1.047 μs (0 allocations: 0 bytes)
SVector using MVector:
  137.492 ns (0 allocations: 0 bytes)
In-place MVector:
  2.600 ns (0 allocations: 0 bytes)
In-place Vector
  2.841 ns (0 allocations: 0 bytes)

With intervals:

length(v) = 10
SVector using setindex:
  50.375 ns (0 allocations: 0 bytes)
SVector using MVector:
  24.940 ns (0 allocations: 0 bytes)
In-place MVector:
  19.855 ns (0 allocations: 0 bytes)
In-place Vector
  19.640 ns (0 allocations: 0 bytes)

length(v) = 20
SVector using setindex:
  124.472 ns (0 allocations: 0 bytes)
SVector using MVector:
  29.527 ns (0 allocations: 0 bytes)
In-place MVector:
  21.473 ns (0 allocations: 0 bytes)
In-place Vector
  21.005 ns (0 allocations: 0 bytes)

length(v) = 50
SVector using setindex:
  305.090 ns (0 allocations: 0 bytes)
SVector using MVector:
  48.689 ns (0 allocations: 0 bytes)
In-place MVector:
  20.079 ns (0 allocations: 0 bytes)
In-place Vector
  20.070 ns (0 allocations: 0 bytes)

length(v) = 100
SVector using setindex:
  635.631 ns (0 allocations: 0 bytes)
SVector using MVector:
  77.139 ns (0 allocations: 0 bytes)
In-place MVector:
  21.217 ns (0 allocations: 0 bytes)
In-place Vector
  19.850 ns (0 allocations: 0 bytes)

length(v) = 200
SVector using setindex:
  1.528 μs (0 allocations: 0 bytes)
SVector using MVector:
  147.505 ns (0 allocations: 0 bytes)
In-place MVector:
  19.644 ns (0 allocations: 0 bytes)
In-place Vector
  20.753 ns (0 allocations: 0 bytes)

length(v) = 500
SVector using setindex:
  4.519 μs (0 allocations: 0 bytes)
SVector using MVector:
  1.452 μs (0 allocations: 0 bytes)
In-place MVector:
  20.529 ns (0 allocations: 0 bytes)
In-place Vector
  20.752 ns (0 allocations: 0 bytes)
4 Likes

Are you working with a single SVector or with an Array{SVector}?

In the latter case, one trick that may help (it requires StructArrays#master) is to keep your SVectors in a StructArray with a column-wise storage, eg

julia> s = StructArray([SVector(rand(3)...) for _ in 1:10]);

julia> s[1]
3-element SVector{3, Float64} with indices SOneTo(3):
 0.9243742489785831
 0.518160330139402
 0.8958973733856339

Then you can “modify” the SVector by modifying the underlying data storage:

julia> s.:1[3] = 12 # get third element in the first component, same as `StructArrays.component(s, 1)[3] = 12`
12

julia> s[3]
3-element SVector{3, Float64} with indices SOneTo(3):
 12.0
  0.20404293877588664
  0.9256989229376644

If instead you have a unique SVector, maybe MVector is the better way. You could do some tricks with ntuple, but I think it would only be efficient up to length 16.

1 Like

That’s interesting, thanks. What I actually have (in IntervalArithmetic.jl) is an IntervalBox type that is just a thin wrapper around an SVector. But it is true that they are often in Vector{IntervalBox}.

For some problems I’m definitely interested in lengths greater than 16, even into the hundreds, and I’m looking for the “right” way to do this.