Static mixed type mutable vector - programmatic access?

I created an interesting structure for my use case. I managed to get everything nice and static, but I’m having trouble using it programmatically. Here are my goals and how they guided the design.

  • Iterate tens of thousands of times over a small set of values
  • Hundreds of thousands of these sets of values are created, so I want to avoid @eval-based solutions
  • The types of these values are decided at runtime and cannot be known in advance
  • Reading/writing of these values is sequential, implying that they should be kept as close as possible in memory

I ended up building this MixedTypeMVector, which wraps an MVector of raw bytes. Getting and setting values is statically typed by passing the index as a Val type.

using StaticArrays: MVector
using BenchmarkTools  # provides the @benchmark macro used in the timings below

function calc_offsets(sizes::NTuple{N,Int})::NTuple{N,Int} where {N}
    offsets_tail = cumsum(sizes)[1:end-1]
    (0, offsets_tail...)
end

struct MixedTypeMVector{T,N,B}
    vectype::Type{T}
    bytes::MVector{B,UInt8}
    offsets::NTuple{N,Int}

    function MixedTypeMVector(types::NTuple{N,DataType}) where {N}
        sizes = sizeof.(types)
        offsets = calc_offsets(sizes)
        nbytes = sum(sizes)
        bytes = MVector{nbytes,UInt8}(undef)

        typ = Tuple{types...}
        new{typ,N,nbytes}(typ, bytes, offsets)
    end
end

Base.length(::MixedTypeMVector{T,N,B}) where {T,N,B} = N

function Base.setindex!(mtmvec::MixedTypeMVector{T,N,B}, value, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_store!(reinterpret(Ptr{T_}, ptr), value)
end

function Base.getindex(mtmvec::MixedTypeMVector{T,N,B}, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_load(reinterpret(Ptr{T_}, ptr))
end
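
To make the byte layout concrete, here is a minimal sketch of the same offset arithmetic and one store/load round trip, using only Base. It assumes a plain Vector{UInt8} as a stand-in for the MVector, and it adds GC.@preserve, which is not in the code above, to keep the buffer rooted while raw pointers are in use.

```julia
# Element types packed back to back, with no alignment padding.
types = (Int64, Float32, UInt8)
sizes = sizeof.(types)                      # size in bytes of each element
offsets = (0, cumsum(sizes)[1:end-1]...)    # byte offset at which each element starts
nbytes = sum(sizes)                         # total buffer size in bytes

# Stand-in buffer (assumption: a Vector{UInt8} instead of an MVector).
buf = Vector{UInt8}(undef, nbytes)

# Write a Float32 into slot 2 and read it back through a typed pointer.
x = GC.@preserve buf begin
    p = reinterpret(Ptr{Float32}, pointer(buf) + offsets[2])
    unsafe_store!(p, 0.2f0)
    unsafe_load(p)
end

println(offsets)
println(x)
```

For these three types the offsets match the (0, 8, 12) shown in the REPL output further down.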

Writing a value is faster than for a Vector{Any}, which is cool. @code_warntype shows that this is type stable, which I assume to be the cause behind the speed.

julia> ts = (Int, Float32, UInt8)
(Int64, Float32, UInt8)

julia> mtmvec = MixedTypeMVector(ts)
MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}(Tuple{Int64, Float32, UInt8}, UInt8[0x00, 0x6e, 0xbc, 0x95, 0xf4, 0x7c, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00], (0, 8, 12))

julia> mtmvec[Val(1)] = 1
1

julia> mtmvec[Val(2)] = 0.2f0
0.2f0

julia> mtmvec[Val(3)] = UInt8(3)
0x03

julia> @benchmark mtmvec[Val(3)] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):  11.744 ns …  2.089 μs  ┊ GC (min … max): 0.00% … 97.72%
 Time  (median):     13.188 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.276 ns ± 29.116 ns  ┊ GC (mean ± σ):  2.97% ±  1.64%

           ▂██▂                                                
  ▃▃▂▂▂▂▃▅▅████▆▃▂▂▂▂▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  11.7 ns         Histogram: frequency by time        19.3 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.

julia> anyvec = Any[1, 0.2f0, UInt8(3)]
3-element Vector{Any}:
    1
    0.2f0
 0x03

julia> @benchmark anyvec[3] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min … max):  54.577 ns … 531.751 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     56.337 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   57.831 ns ±   9.455 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

      ▂▅▇█▇▇▇▅▄▂▁             ▂▁                   ▁▃▅▅▃       ▂
  ▄▄▃████████████▇█▇▇█▇██▇▇▇▇█████▆▆▄▄▁▅▅▁▃▄▃▃▁▁▃▅▆█████▇▅▃▃▃▃ █
  54.6 ns       Histogram: log(frequency) by time      66.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

However, I struggle to use it programmatically. The Performance Tips section of the Julia manual warns about this pattern as well, and the cause is likely that I end up abusing the Val type. Here are two functions, one per container, that iterate over the values.

function mapexp_mtm(data::MixedTypeMVector{T,N,B}) where {T,N,B}
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        val = data[Val(ix)]
        out[ix] = exp(val)
    end
    out
end

function mapexp_anyvec(data::Vector{Any})
    N = length(data)
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        out[ix] = exp(data[ix])
    end
    out
end

Now the Vector{Any} solution is faster again.

@benchmark mapexp_mtm(mtmvec)
BenchmarkTools.Trial: 10000 samples with 310 evaluations.
 Range (min … max):  284.084 ns …   6.556 μs  ┊ GC (min … max): 0.00% … 92.83%
 Time  (median):     304.932 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   322.273 ns ± 225.994 ns  ┊ GC (mean ± σ):  2.89% ±  4.02%

       ▂▅▇█▇▃                                                    
  ▁▁▂▄████████▇▅▄▃▂▂▂▂▁▁▂▂▄▄▄▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  284 ns           Histogram: frequency by time          421 ns <

 Memory estimate: 288 bytes, allocs estimate: 8.

@benchmark mapexp_anyvec(anyvec)
BenchmarkTools.Trial: 10000 samples with 969 evaluations.
 Range (min … max):  79.211 ns …  2.279 μs  ┊ GC (min … max): 0.00% … 93.90%
 Time  (median):     85.538 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.402 ns ± 62.950 ns  ┊ GC (mean ± σ):  3.26% ±  4.71%

     ▂▃█▄                                                      
  ▁▃▅████▅▃▂▂▁▁▃██▄▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  79.2 ns         Histogram: frequency by time         139 ns <

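And this is because of the type instability that I create: Val(ix) is built from a runtime index, so neither its type nor the type of val can be inferred, as the @code_warntype output for mapexp_mtm shows.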

MethodInstance for mapexp_mtm(::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13})
  from mapexp_mtm(data::MixedTypeMVector{T, N, B}) where {T, N, B} @ Main ~/projects/...
Static Parameters
  T = Tuple{Int64, Float32, UInt8}
  N = 3
  B = 13
Arguments
  #self#::Core.Const(mapexp_mtm)
  data::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}
Locals
  @_3::Union{Nothing, Tuple{Int64, Int64}}
  out::Vector{Float32}
  ix::Int64
  val::Any
Body::Vector{Float32}
1 ─ %1  = Core.apply_type(Main.Vector, Main.Float32)::Core.Const(Vector{Float32})
│   %2  = Main.undef::Core.Const(UndefInitializer())
│         (out = (%1)(%2, $(Expr(:static_parameter, 2))))
│   %4  = (1:$(Expr(:static_parameter, 2)))::Core.Const(1:3)
│         (@_3 = Base.iterate(%4))
│   %6  = (@_3::Core.Const((1, 1)) === nothing)::Core.Const(false)
│   %7  = Base.not_int(%6)::Core.Const(true)
└──       goto #4 if not %7
2 ┄ %9  = @_3::Tuple{Int64, Int64}
│         (ix = Core.getfield(%9, 1))
│   %11 = Core.getfield(%9, 2)::Int64
│   %12 = Main.Val(ix)::Val
│         (val = Base.getindex(data, %12))
│   %14 = Main.exp(val)::Any
│         Base.setindex!(out, %14, ix)
│         (@_3 = Base.iterate(%4, %11))
│   %17 = (@_3 === nothing)::Bool
│   %18 = Base.not_int(%17)::Bool
└──       goto #4 if not %18
3 ─       goto #2
4 ┄       return out
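
The `%12 = Main.Val(ix)::Val` line is the core of the problem. It can be reproduced in isolation with a small sketch that uses only Base; `val_runtime` and `val_const` are hypothetical helper names introduced here purely to inspect inference.

```julia
# Hypothetical helpers, used only to look at what inference concludes.
val_runtime(ix::Int) = Val(ix)   # index value is only known at run time
val_const() = Val(3)             # index is a literal, visible to the compiler

# Ask the compiler for the inferred return type of each helper.
t_runtime = only(Base.return_types(val_runtime, (Int,)))
t_const = only(Base.return_types(val_const, ()))

println(isconcretetype(t_runtime))  # is the runtime-index result concretely typed?
println(t_const)                    # inferred type when the index is a constant
```

With a runtime index the result is only known to be some Val, so the following getindex call has to be dispatched dynamically. With a literal index the parameter is part of the inferred type.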

My questions are twofold:

  • Suggestions for approaches on how to make the use of this struct more statically typed
  • Completely new ways to solve the problems that I outlined at the beginning, things I have overlooked in general
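
On the first point, one direction that might be worth trying is to unroll the loop at compile time so that every index reaches Val as a constant. The following is a minimal sketch under stated assumptions, not a tested solution for the struct above: a plain heterogeneous tuple stands in for MixedTypeMVector, and `getval` and `mapexp_unrolled` are hypothetical names. Whether the same approach infers cleanly through the getindex method above would need to be checked with @code_warntype.

```julia
# Stand-in accessor (assumption): mimics data[Val(IX)] on a plain tuple.
getval(data::Tuple, ::Val{IX}) where {IX} = data[IX]

function mapexp_unrolled(data::NTuple{N,Any}) where {N}
    # ntuple with Val(N) calls the closure once for each i in 1:N; because N is
    # a static parameter, the compiler has the chance to see each i as a constant.
    vals = ntuple(i -> Float32(exp(getval(data, Val(i)))), Val(N))
    collect(vals)  # copy the tuple of results into a Vector{Float32}
end

out = mapexp_unrolled((1, 0.2f0, UInt8(3)))
println(out)
```

The explicit Float32 conversion mirrors what storing into a Vector{Float32} does implicitly in mapexp_mtm.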

Thanks for reading!