[ANN] HybridArrays.jl: arrays with both static and dynamic axes

I’ve made HybridArrays.jl, an extension to StaticArrays.jl that makes it easier to work with arrays that have both statically and dynamically sized axes. It should be as fast (or almost as fast) as Arrays of SArrays where both approaches work, but HybridArrays should generate efficient code in any scenario (setindex!, getindex, broadcasting) where the size of the affected part of an array can be statically determined. Additionally, it can wrap any AbstractArray, and operations not supported by HybridArrays.jl can be forwarded to the wrapped object.

To make things even easier, the package contains an implementation of statically sized views. The package is currently used as a backend for the power manifold from Manifolds.jl.
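For illustration, here is a minimal sketch of the basic usage described above. It assumes the `HybridMatrix` constructor and `StaticArrays.Dynamic()` shown later in this thread; treat the exact return types as expectations rather than guarantees:

```julia
using StaticArrays, HybridArrays

# Wrap a regular Matrix: the first axis is statically sized (3),
# the second axis is dynamically sized.
A = HybridMatrix{3, StaticArrays.Dynamic()}(rand(3, 10))

# Indexing along the fully static axis should produce a static array,
# so downstream code can be as fast as with Arrays of SArrays.
col = A[:, 1]

# setindex! along the static axis accepts static arrays too.
A[:, 2] = SVector(1.0, 2.0, 3.0)
```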


Version 0.3.7 was just tagged, now with improved compatibility with the DiffEq ecosystem. Here we can see a more than 20× speedup with generic implicit solvers that don’t work with Arrays of SArrays, just by wrapping the array in a HybridArray: https://github.com/SciML/ArrayInterface.jl/issues/58 .


Version 0.4.0 is out! It’s compatible with StaticArrays.jl 1.0, no longer reimplements most of SubArray from Base, and has better test coverage 🙂. Performance may be a bit lower on Julia < 1.5 because I take advantage of memory-layout changes introduced in Julia 1.5.


Really nice work! Are there any plans to make HybridArrays.jl compatible with LoopVectorization.jl? When I experimented with both packages a few weeks ago, they weren’t compatible (in the sense that LoopVectorization.jl didn’t generate AVX code). If I remember correctly, I’ve seen some discussion of this in various places, but I’d like to know the current status.


Thanks! StaticArrays.jl and HybridArrays.jl take a different approach to code generation than LoopVectorization.jl, so you have to pick which one you want. Fortunately, you can wrap any array in a SizedArray or HybridArray, so you should be able to switch back and forth easily.

julia> using LoopVectorization

julia> function mydotavx(a, b)
           s = 0.0
           @avx for i ∈ eachindex(a,b)
               s += a[i]*b[i]
           end
           s
       end
mydotavx (generic function with 1 method)

julia> using StaticArrays, HybridArrays

julia> a = HybridMatrix{3,StaticArrays.Dynamic()}(randn(3,100))
3×100 HybridArray{Tuple{3,StaticArrays.Dynamic()},Float64,2,2,Array{Float64,2}} with indices SOneTo(3)×Base.OneTo(100):
 -2.63771   -1.13769    0.0971186  …  1.05876    -0.381332  -0.660524
 -0.820122   0.898894  -0.051927      0.0648582  -3.05852    1.288
  0.788884  -0.291043  -0.638714      0.444826    0.340148  -0.27689

julia> mydotavx(a, a)
328.5947898164926

julia> @benchmark mydotavx($a, $a)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     30.873 ns (0.00% GC)
  median time:      30.883 ns (0.00% GC)
  mean time:        30.957 ns (0.00% GC)
  maximum time:     46.679 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     994

julia> @benchmark mydotavx($(a.data), $(a.data))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     23.488 ns (0.00% GC)
  median time:      23.499 ns (0.00% GC)
  mean time:        23.524 ns (0.00% GC)
  maximum time:     32.673 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     996

It’s possible that in the future you won’t have to unwrap manually (some work was done here: https://github.com/SciML/ArrayInterface.jl/pull/61 ), but I don’t know the details. In any case, HybridArrays optionally implements some parts of ArrayInterface and I’m going to keep supporting it, so I guess it’s mostly a question of ArrayInterface <-> LoopVectorization interoperability?

cc @Elrod


I was originally planning to rewrite LoopVectorization, adding some important features and Julia 1.6 support at the same time, but progress on that front has been slow. So I’ve decided to first focus on switching the old code over to the new VectorizationBase.jl and ArrayInterface.jl, without yet taking advantage of some of the new features where the associated changes are unnecessary.

I hope to make a release this weekend, but it’ll depend on how long it takes me to get the remaining tests to pass.

To get LoopVectorization support, you’ll need to define a few methods.

julia> using ArrayInterface, HybridArrays, StaticArrays

julia> a = HybridMatrix{3,StaticArrays.Dynamic()}(randn(3,100));

julia> ArrayInterface.strides(a)
ERROR: MethodError: no method matching strides(::HybridMatrix{3, StaticArrays.Dynamic(), Float64, 2, Matrix{Float64}})
Closest candidates are:
  strides(::SubArray) at subarray.jl:329
  strides(::Union{Base.ReinterpretArray{T, N, S, A, IsReshaped} where S where IsReshaped where A<:Union{SubArray{T, N, A, I, true} where I<:Union{Tuple{Vararg{Real, N} where N}, Tuple{AbstractUnitRange, Vararg{Any, N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, Base.ReshapedArray{T, N, A, MI} where MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}, N} where N} where A<:Union{Base.ReinterpretArray{T, N, S, A, IsReshaped} where S where IsReshaped where A<:Union{SubArray{T, N, A, I, true} where I<:Union{Tuple{Vararg{Real, N} where N}, Tuple{AbstractUnitRange, Vararg{Any, N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, SubArray{T, N, A, I, true} where I<:Union{Tuple{Vararg{Real, N} where N}, Tuple{AbstractUnitRange, Vararg{Any, N} where N}} where A<:DenseArray where N where T, DenseArray} where N where T, DenseArray}) at reinterpretarray.jl:160
  strides(::Base.ReinterpretArray) at reinterpretarray.jl:156
  ...
Stacktrace:
 [1] strides(A::HybridMatrix{3, StaticArrays.Dynamic(), Float64, 2, Matrix{Float64}})
   @ ArrayInterface ~/.julia/dev/ArrayInterface/src/stridelayout.jl:298
 [2] top-level scope
   @ REPL[3]:1

julia> pointer(a)
ERROR: conversion to pointer not defined for HybridMatrix{3, StaticArrays.Dynamic(), Float64, 2, Matrix{Float64}}
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] unsafe_convert(#unused#::Type{Ptr{Float64}}, a::HybridMatrix{3, StaticArrays.Dynamic(), Float64, 2, Matrix{Float64}})
   @ Base ./pointer.jl:67
 [3] pointer(x::HybridMatrix{3, StaticArrays.Dynamic(), Float64, 2, Matrix{Float64}})
   @ Base ./abstractarray.jl:1110
 [4] top-level scope
   @ REPL[4]:1

julia> ArrayInterface.size(a) # not static
(3, 100)

julia> B = @MMatrix rand(3,4);

julia> ArrayInterface.strides(B)
(ArrayInterface.StaticInt{1}(), ArrayInterface.StaticInt{3}())

julia> pointer(B)
Ptr{Float64} @0x00007f73882a7d50

julia> ArrayInterface.size(B) # static
(ArrayInterface.StaticInt{3}(), ArrayInterface.StaticInt{4}())

julia> ArrayInterface.strides(B) # static
(ArrayInterface.StaticInt{1}(), ArrayInterface.StaticInt{3}())

HybridArrays will need a few methods defined, but StaticArrays.MArrays are good to go for the next release of LoopVectorization:

julia> using VectorizationBase # master branch; needs to be 0.13

julia> stridedpointer(B) # if this function works, the array should work
VectorizationBase.StridedPointer{Float64, 2, 1, 0, (1, 2), Tuple{StaticInt{8}, StaticInt{24}}, Tuple{StaticInt{1}, StaticInt{1}}}(Ptr{Float64} @0x00007f73882a7d50, (StaticInt{8}(), StaticInt{24}()), (StaticInt{1}(), StaticInt{1}()))

The list of methods:

ArrayInterface.size(A::HybridArray) = ...
ArrayInterface.strides(A::HybridArray) = ...

Base.unsafe_convert(::Type{Ptr{T}}, A::HybridArray{S,T}) where {S,T} = pointer(A.data)

ArrayInterface.contiguous_axis(::Type{<:HybridArray}) = ArrayInterface.Contiguous{1}()
ArrayInterface.contiguous_batch_size(::Type{<:HybridArray}) = ArrayInterface.ContiguousBatch{0}()
ArrayInterface.stride_rank(::Type{<:HybridArray{<:Any,<:Any,N}}) where {N} = ArrayInterface.StrideRank{ntuple(identity,Val(N))}()

I didn’t test these, but all but the first two should be close to something that works.


That sounds great! I’m looking forward to this new release and will definitely test it in our code. It’s really nice to have such a great and lively ecosystem in Julia 😄

Concerning the last few methods described above: wouldn’t it make sense to forward these calls to the type of A.data when A::HybridArray is the type used in contiguous_axis etc.?

Ah, yeah, if A.data is allowed to be generic, I’d forward all the calls to it, except for ArrayInterface.size and ArrayInterface.strides, where you can provide more information.
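A hedged sketch of what that forwarding could look like. The type-parameter names (S, T, N, M, TData) follow the HybridArray type printed earlier in the thread; these definitions are an illustration, not the package’s actual code:

```julia
using ArrayInterface, HybridArrays

# Forward the memory-layout traits to the wrapped array's type,
# since the wrapper stores its data in the same layout as A.data.
function ArrayInterface.contiguous_axis(::Type{HybridArray{S,T,N,M,TData}}) where {S,T,N,M,TData}
    ArrayInterface.contiguous_axis(TData)
end

function ArrayInterface.contiguous_batch_size(::Type{HybridArray{S,T,N,M,TData}}) where {S,T,N,M,TData}
    ArrayInterface.contiguous_batch_size(TData)
end

function ArrayInterface.stride_rank(::Type{HybridArray{S,T,N,M,TData}}) where {S,T,N,M,TData}
    ArrayInterface.stride_rank(TData)
end
```

ArrayInterface.size and ArrayInterface.strides would still need their own definitions, since the wrapper knows statically sized axes that the wrapped Array does not.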

Thanks for a detailed answer! I’ll work on adding these methods soon.


It looks like HybridArrays.jl supports ArrayInterface.jl only via Requires.jl. Would it make sense to make ArrayInterface.jl an explicit dependency so that users don’t have to write using ArrayInterface additionally?

Any library using ArrayInterface should do using ArrayInterface itself, so I think that’s fine as is.


Currently HybridArrays.jl supports Julia 1.0 and up, while ArrayInterface.jl supports only 1.2 and up. So using Requires.jl is the only way I can support ArrayInterface.jl without raising the lower bound on Julia.
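For context, conditional support via Requires.jl typically looks like the following sketch inside the package module. The included file name is hypothetical, and the UUID is assumed to be ArrayInterface.jl’s registered one (it should be checked against ArrayInterface’s Project.toml):

```julia
using Requires

# Runs when HybridArrays is loaded; the @require block fires only
# if (and when) the user also loads ArrayInterface.
function __init__()
    @require ArrayInterface = "4fba245c-0d91-5ea0-9b3e-6abc04ee57a9" begin
        include("arrayinterface_compat.jl")  # hypothetical file with the trait methods
    end
end
```

This way the ArrayInterface methods are defined only in sessions where ArrayInterface is present, so the Julia 1.0 compatibility bound is unaffected.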

So it will work if we don’t depend on ArrayInterface ourselves but are using HybridArrays and LoopVectorization? That sounds good.

Yes, it should work like that.


I’ve added a few new ArrayInterface methods in HybridArrays 0.4.1 🙂. The @avx macro now improves performance without manual unwrapping.
