I created an interesting structure for my use case. I managed to get everything nice and static, but I’m having trouble using it programmatically. Here are my goals and how they guided the design.
- Iterate tens of thousands of times over a small set of values
- Hundreds of thousands of these sets of values are created, which makes me avoid @eval-oriented solutions
- The types of these values are decided at runtime and cannot be known in advance
- Reading/writing of these values is sequential, implying that they should be kept as close as possible in memory
I ended up building this MixedTypeMVector, which abstracts an MVector that just contains some bytes. Getting and setting values is statically typed through the use of Val types.
using StaticArrays: MVector

function calc_offsets(sizes::NTuple{N,Int})::NTuple{N,Int} where {N}
    offsets_tail = cumsum(sizes)[1:end-1]
    (0, offsets_tail...)
end

struct MixedTypeMVector{T,N,B}
    vectype::Type{T}
    bytes::MVector{B,UInt8}
    offsets::NTuple{N,Int}

    function MixedTypeMVector(types::NTuple{N,DataType}) where {N}
        sizes = sizeof.(types)
        offsets = calc_offsets(sizes)
        nbytes = sum(sizes)
        bytes = MVector{nbytes,UInt8}(undef)
        typ = Tuple{types...}
        new{typ,N,nbytes}(typ, bytes, offsets)
    end
end

Base.length(::MixedTypeMVector{T,N,B}) where {T,N,B} = N

function Base.setindex!(mtmvec::MixedTypeMVector{T,N,B}, value, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_store!(reinterpret(Ptr{T_}, ptr), value)
end

function Base.getindex(mtmvec::MixedTypeMVector{T,N,B}, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_load(reinterpret(Ptr{T_}, ptr))
end
Writing a value is faster than for a Vector{Any}, which is cool. @code_warntype shows that this is type stable, which I assume to be the cause behind the speed.
julia> ts = (Int, Float32, UInt8)
(Int64, Float32, UInt8)
julia> mtmvec = MixedTypeMVector(ts)
MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}(Tuple{Int64, Float32, UInt8}, UInt8[0x00, 0x6e, 0xbc, 0x95, 0xf4, 0x7c, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00], (0, 8, 12))
julia> mtmvec[Val(1)] = 1
1
julia> mtmvec[Val(2)] = 0.2f0
0.2f0
julia> mtmvec[Val(3)] = UInt8(3)
0x03
julia> @benchmark mtmvec[Val(3)] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.744 ns … 2.089 μs ┊ GC (min … max): 0.00% … 97.72%
Time (median): 13.188 ns ┊ GC (median): 0.00%
Time (mean ± σ): 14.276 ns ± 29.116 ns ┊ GC (mean ± σ): 2.97% ± 1.64%
▂██▂
▃▃▂▂▂▂▃▅▅████▆▃▂▂▂▂▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
11.7 ns Histogram: frequency by time 19.3 ns <
Memory estimate: 16 bytes, allocs estimate: 1.
julia> anyvec = Any[1, 0.2f0, UInt8(3)]
3-element Vector{Any}:
1
0.2f0
0x03
julia> @benchmark anyvec[3] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 54.577 ns … 531.751 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 56.337 ns ┊ GC (median): 0.00%
Time (mean ± σ): 57.831 ns ± 9.455 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▅▇█▇▇▇▅▄▂▁ ▂▁ ▁▃▅▅▃ ▂
▄▄▃████████████▇█▇▇█▇██▇▇▇▇█████▆▆▄▄▁▅▅▁▃▄▃▃▁▁▃▅▆█████▇▅▃▃▃▃ █
54.6 ns Histogram: log(frequency) by time 66.9 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
However, I struggle to use it programmatically. The performance tips in the documentation warn about this as well, and the problem is likely that I end up abusing the Val type. Here are two functions that iterate over these types.
function mapexp_mtm(data::MixedTypeMVector{T,N,B}) where {T,N,B}
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        val = data[Val(ix)]
        out[ix] = exp(val)
    end
    out
end

function mapexp_anyvec(data::Vector{Any})
    N = length(data)
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        out[ix] = exp(data[ix])
    end
    out
end
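To make the concern about Val concrete, here is a small check of what inference can conclude when Val receives a runtime integer versus a literal. It is only a sketch: val_from_runtime and val_from_literal are hypothetical helper names, and Base.return_types is a reflection helper that is not part of the exported API, so its output may differ between Julia versions.

```julia
# Val(ix) with a runtime integer: the result type depends on the value of ix,
# so inference can only conclude the abstract type Val.
val_from_runtime(ix::Int) = Val(ix)

# Val(3) with a literal: the constant is visible to the compiler.
val_from_literal() = Val(3)

# Ask inference for the return type of each helper (hypothetical names).
runtime_type = only(Base.return_types(val_from_runtime, (Int,)))
literal_type = only(Base.return_types(val_from_literal, ()))

println("runtime index: ", runtime_type, " (concrete: ", isconcretetype(runtime_type), ")")
println("literal index: ", literal_type, " (concrete: ", isconcretetype(literal_type), ")")
```

A loop variable such as ix in mapexp_mtm falls into the first case, so every data[Val(ix)] call has to be dispatched at runtime.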
Now the Vector{Any} solution is faster again.
@benchmark mapexp_mtm(mtmvec)
BenchmarkTools.Trial: 10000 samples with 310 evaluations.
Range (min … max): 284.084 ns … 6.556 μs ┊ GC (min … max): 0.00% … 92.83%
Time (median): 304.932 ns ┊ GC (median): 0.00%
Time (mean ± σ): 322.273 ns ± 225.994 ns ┊ GC (mean ± σ): 2.89% ± 4.02%
▂▅▇█▇▃
▁▁▂▄████████▇▅▄▃▂▂▂▂▁▁▂▂▄▄▄▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
284 ns Histogram: frequency by time 421 ns <
Memory estimate: 288 bytes, allocs estimate: 8.
@benchmark mapexp_anyvec(anyvec)
BenchmarkTools.Trial: 10000 samples with 969 evaluations.
Range (min … max): 79.211 ns … 2.279 μs ┊ GC (min … max): 0.00% … 93.90%
Time (median): 85.538 ns ┊ GC (median): 0.00%
Time (mean ± σ): 92.402 ns ± 62.950 ns ┊ GC (mean ± σ): 3.26% ± 4.71%
▂▃█▄
▁▃▅████▅▃▂▂▁▁▃██▄▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
79.2 ns Histogram: frequency by time 139 ns <
And this is because of the type instabilities that I create.
MethodInstance for mapexp_mtm(::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13})
from mapexp_mtm(data::MixedTypeMVector{T, N, B}) where {T, N, B} @ Main ~/projects/...
Static Parameters
T = Tuple{Int64, Float32, UInt8}
N = 3
B = 13
Arguments
#self#::Core.Const(mapexp_mtm)
data::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}
Locals
@_3::Union{Nothing, Tuple{Int64, Int64}}
out::Vector{Float32}
ix::Int64
val::Any
Body::Vector{Float32}
1 ─ %1 = Core.apply_type(Main.Vector, Main.Float32)::Core.Const(Vector{Float32})
│ %2 = Main.undef::Core.Const(UndefInitializer())
│ (out = (%1)(%2, $(Expr(:static_parameter, 2))))
│ %4 = (1:$(Expr(:static_parameter, 2)))::Core.Const(1:3)
│ (@_3 = Base.iterate(%4))
│ %6 = (@_3::Core.Const((1, 1)) === nothing)::Core.Const(false)
│ %7 = Base.not_int(%6)::Core.Const(true)
└── goto #4 if not %7
2 ┄ %9 = @_3::Tuple{Int64, Int64}
│ (ix = Core.getfield(%9, 1))
│ %11 = Core.getfield(%9, 2)::Int64
│ %12 = Main.Val(ix)::Val
│ (val = Base.getindex(data, %12))
│ %14 = Main.exp(val)::Any
│ Base.setindex!(out, %14, ix)
│ (@_3 = Base.iterate(%4, %11))
│ %17 = (@_3 === nothing)::Bool
│ %18 = Base.not_int(%17)::Bool
└── goto #4 if not %18
3 ─ goto #2
4 ┄ return out
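One direction that might remove the Val(ix)::Val instability shown above is to let the compiler unroll the loop, so that each index is a literal. The sketch below uses ntuple(f, Val(N)) for that. To keep it runnable without StaticArrays it uses ValIndexed, a hypothetical stand-in that only mimics the data[Val(ix)] access of MixedTypeMVector and carries N as a type parameter; mapexp_unrolled is likewise a made-up name. Whether the constant index actually propagates into getindex is an assumption that should be verified with @code_warntype on the real struct.

```julia
# Hypothetical stand-in for MixedTypeMVector: stores the values in a tuple
# and exposes the same Val-based getindex.
struct ValIndexed{T<:Tuple,N}
    values::T
    ValIndexed(values::T) where {T<:Tuple} = new{T,length(values)}(values)
end

Base.getindex(v::ValIndexed, ::Val{IX}) where {IX} = v.values[IX]

# ntuple with Val(N) calls the closure once per index 1..N; with a static N
# the calls can be unrolled, so the compiler has a chance to see each ix as
# a constant. The result is a tuple instead of a Vector{Float32}.
function mapexp_unrolled(data::ValIndexed{T,N}) where {T,N}
    ntuple(ix -> Float32(exp(data[Val(ix)])), Val(N))
end

data = ValIndexed((1, 0.2f0, UInt8(3)))
result = mapexp_unrolled(data)
println(result)
```

This compiles one specialization per distinct combination of element types rather than per instance, so it only helps if many of the sets share the same type layout.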
I am looking for two kinds of input:
- Suggestions for how to make the use of this struct more statically typed
- Completely different ways to solve the problems I outlined at the beginning, or anything I have overlooked in general
Thanks for reading!