Static view

question

#1

I am slicing up a Vector{T} into pieces of heterogeneous lengths, which are known at compile time. Getting the pieces as StaticArrays would help. Does the following (much simplified for the MWE) make sense?

using StaticArrays

struct SView{L}
    start::Int
end

function Base.view(x::Vector{T}, v::SView{L}) where {T,L}
    reinterpret(SVector{L,T}, view(x, v.start:(v.start + L - 1)))
end

x = collect(1:10)
v = SView{4}(3)
view(x, v)

Also, I wonder if it would be possible to make it faster. The @code_warntype output is type-stable but quite convoluted.


#2

How about this, which avoids the view? It takes 2 ns instead of 16 ns:

function Base.view(x::Vector{T}, v::SView{L}) where {T,L}
    SVector{L,T}(ntuple(i -> x[i+v.start-1], L))
end

#3

Excellent, this would also work for other subtypes of AbstractVector if the signature is widened.
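
A minimal sketch of that generalization, assuming the only change needed is the argument type. The name `sslice` is hypothetical (it is used here so that Base.view is not extended), and `Val(L)` is passed to ntuple as a small variation on the version above:

```julia
using StaticArrays

# Same marker type as in the first post: L is the static length,
# start is the index of the first element of the slice.
struct SView{L}
    start::Int
end

# `sslice` is a hypothetical name, not part of StaticArrays.
# It copies L consecutive elements of any AbstractVector into an SVector.
function sslice(x::AbstractVector{T}, v::SView{L}) where {T,L}
    SVector{L,T}(ntuple(i -> x[i + v.start - 1], Val(L)))
end

x = collect(1:10)
a = sslice(x, SView{4}(3))              # Vector input
b = sslice(1:10, SView{4}(3))           # range input
c = sslice(view(x, 2:10), SView{4}(2))  # SubArray input, same four elements
```

All three calls should return the same SVector holding elements 3 through 6, since the indexing goes through the generic x[i] interface rather than raw memory.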

For some reason I thought that just reinterpreting a contiguous piece of memory would be fastest, but maybe I am approaching this wrong.


#4

What about loading SVectors directly from your vector?

using StaticArrays

struct SView{L}
    start::Int
end

function Base.view(x::Vector{T}, v::SView{L}) where {T,L}
    reinterpret(SVector{L,T}, view(x, v.start:(v.start + L - 1)))
end

x = collect(1:10)
v = SView{4}(3)
view(x, v)

function svload(x::Vector{T}, v::SView{L}) where {T,L}
    ptr_x = Base.unsafe_convert(Ptr{SVector{L,T}}, pointer(x))
    unsafe_load(ptr_x + (v.start-1) * sizeof(T))
end
svload(x, v)

For comparison,

julia> view(x, v)
1-element reinterpret(SArray{Tuple{4},Int64,1,4}, view(::Array{Int64,1}, 3:6)):
 [3, 4, 5, 6]

julia> svload(x, v)'
1×4 LinearAlgebra.Adjoint{Int64,SArray{Tuple{4},Int64,1,4}}:
 3  4  5  6

julia> @code_native svload(x, v)
	.text
; ┌ @ REPL[12]:2 within `svload'
; │┌ @ abstractarray.jl:882 within `pointer'
; ││┌ @ REPL[12]:2 within `unsafe_convert'
	movq	(%rsi), %rax
; │└└
; │ @ REPL[12]:3 within `svload'
; │┌ @ int.jl:52 within `-'
	movq	(%rdx), %rcx
; │└
; │┌ @ pointer.jl:105 within `unsafe_load' @ pointer.jl:105
	vmovups	-8(%rax,%rcx,8), %ymm0
; │└
	movq	%rdi, %rax
	vmovups	%ymm0, (%rdi)
	vzeroupper
	retq
	nopw	(%rax,%rax)
; └

You'd probably want to add an @boundscheck to svload. Right now it isn't safe.
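
One way that could look, as a sketch rather than a vetted implementation. The name `svload_checked` is hypothetical, the code assumes T is an isbits type (so the elements are stored inline in the array), and the GC.@preserve is a conservative addition of mine to keep x rooted while the raw pointer is in use:

```julia
using StaticArrays

struct SView{L}
    start::Int
end

# `svload_checked` is a hypothetical name; it assumes isbitstype(T).
@inline function svload_checked(x::Vector{T}, v::SView{L}) where {T,L}
    # The load touches indices v.start through v.start + L - 1.
    @boundscheck checkbounds(x, v.start:(v.start + L - 1))
    # Keep x alive for the garbage collector while we hold a raw pointer into it.
    GC.@preserve x begin
        # pointer(x, i) is the address of x[i]; reinterpret it as a pointer to an SVector.
        p = convert(Ptr{SVector{L,T}}, pointer(x, v.start))
        unsafe_load(p)
    end
end

x = collect(1:10)
s = svload_checked(x, SView{4}(3))
```

With the check in place, a start index that would run past the end of x throws a BoundsError instead of reading adjacent memory. A caller that has already validated the index can opt out with @inbounds, provided the function is inlined.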


#5

In order to get proper boundscheck elision, use

julia> function Base.view(x::Vector{T}, v::SView{L}) where {T,L}
          # a slice of length L starting at v.start ends at v.start + L - 1
          @boundscheck checkbounds(x, v.start:(v.start + L - 1))
          SVector{L,T}(ntuple(i -> (@inbounds x[i+v.start-1]), L))
       end
julia> iv(x,v)=@inbounds view(x,v)
julia> @code_native iv(x,v)
	.text
; Function iv {
	movq	(%rdx), %rax
	movq	(%rsi), %rcx
	vmovups	-8(%rcx,%rax,8), %ymm0
	vmovups	%ymm0, (%rdi)
	movq	%rdi, %rax
	vzeroupper
	retq
	nopw	(%rax,%rax)
;}
julia> view(x, SView{4}(9))
ERROR: BoundsError: attempt to access 10-element Array{Int64,1} at index [9:12]

Also note that proper vector loads are used. This is the idiom I'd recommend and use in my own code (in order to work around the current penalty for reinterprets between Vector{<:SVector} and Matrix). Apart from genericity, you don't need to spend thought on GC visibility. E.g., does svload need a GC.@preserve? Probably yes, but I don't want to think about it.


#6

Nice, this is more honestly a view, while mine should probably be a method of getindex. They appear to be the same speed though, if both are without bounds checks.
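
For what it's worth, a copying load could be spelled as a getindex method along these lines. This is only a sketch: it reuses the SView marker type from the thread, keeps the ntuple body so no raw pointers are involved, and the explicit @inline is my assumption about what makes the caller's @inbounds reliably reach the @boundscheck block:

```julia
using StaticArrays

struct SView{L}
    start::Int
end

# Sketch of a getindex method keyed on the thread's SView marker type.
# x[v] returns a copy of L consecutive elements as an SVector.
@inline function Base.getindex(x::Vector{T}, v::SView{L}) where {T,L}
    @boundscheck checkbounds(x, v.start:(v.start + L - 1))
    SVector{L,T}(ntuple(i -> (@inbounds x[i + v.start - 1]), Val(L)))
end

x = collect(1:10)
y = x[SView{4}(3)]
```

Because the result is an independent SVector, later writes to x are not reflected in y, which is the sense in which getindex describes it better than view.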