Flattening view of `Vector{SVector{N}}`

Suppose that v is a Vector of SVector (a concrete type), e.g.

using StaticArrays
v = [rand(SVector{3}) for _ in 1:10]

Is there a package that implements a non-allocating view equivalent to reduce(vcat, v)::Vector?

Note that

  1. Iterators.flatten kind of does this, but it is not special-cased for SVectors, so the size calculation walks through all the elements.
  2. I can use vec(SplitApplyCombine.combinedimsview(v)), but again that does not specialize on SVectors. (A quick sketch of both is below.)
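For concreteness, a minimal sketch of both workarounds (my own illustration, not from any package):

using StaticArrays, SplitApplyCombine

v = [rand(SVector{3}) for _ in 1:10]

f = Iterators.flatten(v)                       # lazy, but generic over any iterable
w = vec(SplitApplyCombine.combinedimsview(v))  # view-like, also not SVector-aware

collect(f) == w == reduce(vcat, v)             # all three agree on the flattened values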

In my actual application length(v) is large. I can implement this easily; I am just asking whether a package already exists.

Does reinterpret(Float64, v) not do what you want?
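For example (a quick sanity check, continuing with the v from the question):

julia> r = reinterpret(Float64, v);  # zero-copy ReinterpretArray over v's buffer

julia> r == reduce(vcat, v)          # same values as the allocating version
true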

5 Likes

Yes, it does, thank you!

Also consider b = ccall(:jl_reshape_array, Array{Float64,1}, (Any, Any, Any), Array{Float64,1}, v, (sizeof(v)>>3,)), if the reinterpret array gives you performance trouble on read/write access.

At the very least, benchmark jl_reshape against reinterpret.

If the reinterpret array gives you perf trouble, then it is best to ask around here for help.

The jl_reshape call is really what you want, except that it lies to the compiler about TBAA, in a way that may or may not lead to UB / miscompilation in very specific circumstances in current, past, or future versions of Julia. (I'm insufficiently up-to-date about the details.)

The code for ReinterpretArray is, on a naive reading, a complete performance-killing abomination. In many cases, LLVM manages to remove the extraneous frills and the end result is OK; in some cases, LLVM fails to optimize it away, and you get a big slowdown.

What do you mean? AFAIU it just creates a new container for the same data, and is thus usually very fast:

julia> x = rand(SVector{3,Float64}, 10^6);

julia> @btime reinterpret(Float64, $x);
  2.534 ns (0 allocations: 0 bytes)

Yes, creation of the reinterpret array is fast, but accessing the same data through the new ReinterpretArray is sometimes very slow.

It is unfortunate that, AFAIU, Base does not supply a safe and reliably semi-performant way of doing that (the current ReinterpretArray is very hit-and-miss performance-wise, because it tries to work at the level of AbstractArray instead of at the level of raw memory).

julia> using StaticArrays, BenchmarkTools

julia> v = [rand(SVector{8}) for _ in 1:1024];

julia> r = reinterpret(Float64, v);

julia> b = ccall(:jl_reshape_array, Array{Float64,1}, (Any, Any, Any), Array{Float64,1}, v, (sizeof(v)>>3,));

julia> @benchmark sum(b)
BenchmarkTools.Trial: 10000 samples with 181 evaluations.
 Range (min … max):  592.331 ns …  1.384 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     615.978 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   624.047 ns ± 42.785 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▄▆▇██▇▆▅▃▂▂▁▁                                               ▂
  ████████████████▆▅▁▅▄▅▅▅▆▆▆▇▆▆▅▆▇██▇▇▇▆▆▇▆▅▆▆▃▄▃▆▅▆▆▅▅▆▅▆▇▇▇ █
  592 ns        Histogram: log(frequency) by time       851 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> @benchmark sum(r)
BenchmarkTools.Trial: 10000 samples with 4 evaluations.
 Range (min … max):  7.024 μs …  18.987 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     7.041 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.117 μs ± 651.430 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▂▁                                                         ▁
  ███▇█▇▇▇▇▇▆▅▄▃▃▃▁▁▁▄▁▁▁▃▃▁▃▁▃▁▃▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▃▃▃▅ █
  7.02 μs      Histogram: log(frequency) by time      10.5 μs <

 Memory estimate: 16 bytes, allocs estimate: 1.
3 Likes

Are you aware of any open issue regarding that? If there isn't, it is probably a good idea to file one.

FWIW, the TensorCast.jl flattening view has the same performance on access as the ccall():

using StaticArrays
v = [rand(SVector{8}) for _ in 1:1024]
r = reinterpret(Float64, v)
b = ccall(:jl_reshape_array, Array{Float64,1}, (Any, Any, Any), Array{Float64,1}, v, (sizeof(v)>>3,))

using TensorCast
@cast c[j⊗i] := v[i][j]

@btime sum($r)      # 5.367 μs (0 allocs: 0 bytes)
@btime sum($b)      # 730 ns (0 allocs: 0 bytes)
@btime sum($c)      # 729 ns (0 allocs: 0 bytes)
1 Like

Lots of them, cf. Issues · JuliaLang/julia · GitHub

Keno occasionally tries to fix ReinterpretArray performance by coaxing LLVM to properly optimize yet another case, but that's whack-a-mole.

The only reliably fast way is "interpret this chunk of memory in the following way" (and yes, this means that strided views or transposed matrices fundamentally should not be reinterpreted! You don't reinterpret arrays / sequences of numbers, you reinterpret ranges of raw memory. And sure, this will expose rather sharp edges to users with respect to structure padding).
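To illustrate the "chunk of memory" framing (my own sketch, not an API anyone ships): you can wrap the raw buffer with unsafe_wrap, at the price of managing lifetimes yourself:

using StaticArrays

v = [rand(SVector{8}) for _ in 1:1024]
p = Ptr{Float64}(pointer(v))                            # raw pointer to v's buffer
b2 = unsafe_wrap(Array, p, 8 * length(v); own = false)  # Vector{Float64} over the same memory
# Caution: b2 does not keep v alive; root v (e.g. with GC.@preserve) for as
# long as b2 is in use, and never resize v while b2 exists.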

But that approach sits in the unfortunate position of having to wait, for several years now, for a proper formalization of Julia's aliasing model / TBAA.

If you want to help by either writing a fix or increasing awareness / pestering people, godspeed!

(I personally use jl_reshape, sprinkle @noinline, and pray that the compiler doesn't get smart enough to call me out)
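Concretely, something like this hypothetical flatview helper (the name is made up), combining the jl_reshape_array ccall from earlier in the thread with @noinline, with all the TBAA caveats above still applying:

using StaticArrays

@noinline function flatview(v::Vector{SVector{N,T}}) where {N,T}
    # Same jl_reshape_array trick as above, wrapped for convenience.
    ccall(:jl_reshape_array, Vector{T}, (Any, Any, Any),
          Vector{T}, v, (N * length(v),))
end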

1 Like

It doesn't. It creates a copy:

julia> v = [rand(SVector{8}) for _ in 1:1024]; @cast c[j⊗i] := v[i][j];

julia> pointer(v), pointer(c)
(Ptr{SVector{8, Float64}} @0x00000000043ab280, Ptr{Float64} @0x0000000004907a00)

In many cases, the entire point of the flattened view is that it becomes possible to mutate single entries (i.e. you reinterpret Vector{SVector} as Matrix).
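For instance (my own illustration), reshaping the reinterpreted vector gives a writable matrix view of the same memory:

julia> v = [rand(SVector{3,Float64}) for _ in 1:4];

julia> m = reshape(reinterpret(Float64, v), 3, :);  # 3×4 view; column i aliases v[i]

julia> m[1, 2] = 0.0;  # writes through to the underlying buffer

julia> v[2][1]
0.0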

3 Likes

@Elrod has a package that handles the strided arrays case:

Thanks @fooobar_lv2. And sorry for the bad TensorCast.jl pointer; the expression with := does not produce a view in this case.