Efficiently selecting parts of a StaticArray

package

#1

Is there a convenient and efficient way of slicing StaticArrays, or let’s say, dropping parts of an SArray?

julia> vs = SVector{4}(rand(4))
4-element SVector{4,Float64}:
 0.580171
 0.719607
 0.651051
 0.430415

julia> vs[1:3]
3-element Array{Float64,1}:
 0.580171
 0.719607
 0.651051

Slicing like this works, but returns an ordinary Array and isn’t particularly fast:

julia> @btime $vs[1:3]
  52.049 ns (3 allocations: 192 bytes)
3-element Array{Float64,1}:
 0.580171
 0.719607
 0.651051

julia> @btime SVector($vs[1], $vs[2], $vs[3])
  2.030 ns (0 allocations: 0 bytes)
3-element SVector{3,Float64}:
 0.580171
 0.719607
 0.651051

I get that the compiler cannot figure out the type of vs[1:3], but if I could tell it to "drop the last element of vs" somehow, then it would know to the return type SVector{3}.

I also need to do this for SVector{3}, SMatrix{4,4} and SMatrix{3,3}. I can of course write several methods like this:

droplast(vs::SVector{4}) = SVector{3}(vs[1], vs[2], vs[3])

but it seems like there must be a more general approach, perhaps one that is already in use?


#2

Use SVector all the way.

julia> vs = SVector{4}(rand(4))
4-element SVector{4,Float64}:
 0.658262 
 0.502845 
 0.0820214
 0.3283   

julia> idx = SVector(1,2,3)
3-element SVector{3,Int64}:
 1
 2
 3

julia> vs[idx]
3-element SVector{3,Float64}:
 0.658262 
 0.502845 
 0.0820214

julia> @btime $vs[$idx];
  2.335 ns (0 allocations: 0 bytes)

#3

That’s excellent! I’m still missing a step, though. Do you know how I could create an index vector 1:N-1 to be able to drop the last index for a size N vector. I tried this:

function droplast(v::SVector{N, <:Number}) where {N}
    ind = @SVector [i for i in 1:N-1]
    return v[ind]
end

but this looks for N in global scope, so doesn’t work. Interestingly, @SVector fill(1, N-1) and some others behave nicely inside functions, so perhaps adding something like @SVector 1:N-1 could be a feature request to StaticArrays.jl?

For now, perhaps a @generated function would do the trick?


#4

Something like:

@generated function droplast{N,T}(v::SVector{N, T})
           ind = SVector{N-1,Int64}([i for i in 1:N-1]...)    
           :(return v[$ind])
end

works here (on Julia latest 0.7.0-DEV.515)


#5

Great! This works. In fact, it’s possible to simplify a bit:

@generated function droplast{N, T}(v::SVector{N, T})
   ind = SVector{N-1}(1:N-1)
   return :(v[$ind])
end

#6

Or without generated (but a `@pure instead):

julia> Base.@pure pure_sub(a::Int, b::Int) = a - b
pure_sub (generic function with 1 method)

julia> function droplast(v::SVector{N, <:Number}) where {N}
           ind = SVector{pure_sub(N, 1)}(1:N-1)
           return v[ind]
       end
droplast (generic function with 1 method)

julia> @btime droplast($a)
  6.216 ns (0 allocations: 0 bytes)
4-element SVector{4,Float64}:
 0.945968 
 0.0451724
 0.917798 
 0.368511 

#7

Without @generated and without Base.@pure:

julia> droplast(x::Tuple) = argdroplast(x...)
droplast (generic function with 1 method)
julia> argdroplast(x) = ()
argdroplast (generic function with 1 method)
julia> argdroplast(x,y...) = (x,argdroplast(y...)...)
argdroplast (generic function with 2 methods)
julia> droplast(v::SVector) = SVector(droplast(v.data))
droplast (generic function with 2 methods)

Now benchmarking:

julia> @btime droplast2($sv)    # with @generated
  1.705 ns (0 allocations: 0 bytes)
julia> @btime droplast3($sv)    # with Base.@pure
  6.373 ns (0 allocations: 0 bytes)
julia> @btime droplast($sv)     # defined above
  2.022 ns (0 allocations: 0 bytes)

But the numbers can confuse, @code_native shows the same code is produced for the @generated version and the one above.

Added bonus is the droplast for Tuples which could be used elsewhere.


#8

Looping from 2 : 15 (with force inline on the recursive functions):

  Generated   Pure        Recursion
  1.791 ns    4.154 ns    1.790 ns 
  1.792 ns    6.123 ns    2.086 ns 
  2.086 ns    6.068 ns    1.792 ns 
  2.087 ns    6.217 ns    1.793 ns 
  2.087 ns    6.926 ns    2.088 ns 
  2.088 ns    7.097 ns    2.088 ns 
  2.089 ns    7.106 ns    2.090 ns 
  2.386 ns    7.985 ns    2.385 ns 
  2.092 ns    7.750 ns    2.092 ns 
  3.270 ns    8.873 ns    3.270 ns 
  2.387 ns    8.286 ns    2.387 ns 
  3.559 ns    9.753 ns    3.559 ns 
  2.679 ns    8.575 ns    2.690 ns 
  2.688 ns    10.047 ns   8.657 μs <----

Looks like the generated one and tuple is the same speed except for n > 15 when the tuple one gives up. Pure is slower but doesn’t freak out at n = 15.


#9

Lots of really interesting answers here. The ideal solution would be if StaticArrays and the @SVector macro could support StepRanges, e.g.

ind = @SVector 2:2:N-1

etc., where N is a type parameter.

Any idea if this is even possible in theory?


#10

If you use N as always meaning the length of the vector and only use literals in the range, it should be possible.


#11

OK. I may raise that as a feature request in StaticArrays.jl. Seems to me like it could be useful.

Actually, it seems like a special case of the comprehension, which does not quite work as wanted yet: https://github.com/JuliaArrays/StaticArrays.jl#svector