I created an interesting structure for my use case. I managed to get everything nice and static, but I'm having trouble using it programmatically. Here are my goals and how they guided the design.
- Iterate tens of thousands of times over a small set of values
- Hundreds of thousands of these sets of values are created, which makes me avoid @eval-oriented solutions
- The types of these values are decided at runtime and cannot be known in advance
- Reading/writing of these values is sequential, implying that they should be kept as close as possible in memory
I ended up building this MixedTypeMVector, which abstracts an MVector that just contains some bytes. Getting and setting values is statically typed through the use of Val types.
using StaticArrays: MVector

# Byte offset of each slot: 0 for the first, then the running sum of the sizes.
function calc_offsets(sizes::NTuple{N,Int})::NTuple{N,Int} where {N}
    offsets_tail = cumsum(sizes)[1:end-1]
    (0, offsets_tail...)
end

# T is the Tuple type of the stored values, N the number of slots, B the number of bytes.
struct MixedTypeMVector{T,N,B}
    vectype::Type{T}
    bytes::MVector{B,UInt8}
    offsets::NTuple{N,Int}

    function MixedTypeMVector(types::NTuple{N,DataType}) where {N}
        sizes = sizeof.(types)
        offsets = calc_offsets(sizes)
        nbytes = sum(sizes)
        bytes = MVector{nbytes,UInt8}(undef)
        typ = Tuple{types...}
        new{typ,N,nbytes}(typ, bytes, offsets)
    end
end

Base.length(::MixedTypeMVector{T,N,B}) where {T,N,B} = N

# Store `value` in slot IX, interpreting the bytes at its offset as the slot's type.
function Base.setindex!(mtmvec::MixedTypeMVector{T,N,B}, value, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_store!(reinterpret(Ptr{T_}, ptr), value)
end

# Load the value in slot IX with the slot's type.
function Base.getindex(mtmvec::MixedTypeMVector{T,N,B}, ::Val{IX}) where {T,N,B,IX}
    T_ = mtmvec.vectype.parameters[IX]
    offset = mtmvec.offsets[IX]
    ptr = pointer(mtmvec.bytes) + offset
    unsafe_load(reinterpret(Ptr{T_}, ptr))
end
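To make the layout concrete, here is a small worked example of the offset computation for the three types used in the session further down. It only mirrors what calc_offsets does; the variable names are chosen for illustration.

```julia
# Worked example of the byte layout for (Int64, Float32, UInt8).
types = (Int64, Float32, UInt8)

sizes = sizeof.(types)                     # bytes per slot: (8, 4, 1)
offsets = (0, cumsum(sizes)[1:end-1]...)   # start of each slot: (0, 8, 12)
nbytes = sum(sizes)                        # total buffer size: 13

println(offsets, " ", nbytes)
```

Note that the slots are packed without padding, so with a different order (for example UInt8 first) the wider values would sit at unaligned addresses.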
Writing a value is faster than for a Vector{Any}, which is cool. @code_warntype shows that this is type stable, which I assume to be the cause behind the speed.
julia> ts = (Int, Float32, UInt8)
(Int64, Float32, UInt8)
julia> mtmvec = MixedTypeMVector(ts)
MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}(Tuple{Int64, Float32, UInt8}, UInt8[0x00, 0x6e, 0xbc, 0x95, 0xf4, 0x7c, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00], (0, 8, 12))
julia> mtmvec[Val(1)] = 1
1
julia> mtmvec[Val(2)] = 0.2f0
0.2f0
julia> mtmvec[Val(3)] = UInt8(3)
0x03
julia> @benchmark mtmvec[Val(3)] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):  11.744 ns …  2.089 μs  ┊ GC (min … max): 0.00% … 97.72%
 Time  (median):     13.188 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.276 ns ± 29.116 ns  ┊ GC (mean ± σ):  2.97% ±  1.64%

  11.7 ns         Histogram: frequency by time         19.3 ns <

 Memory estimate: 16 bytes, allocs estimate: 1.
julia> anyvec = Any[1, 0.2f0, UInt8(3)]
3-element Vector{Any}:
1
0.2f0
0x03
julia> @benchmark anyvec[3] = UInt8(33)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min … max):  54.577 ns … 531.751 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     56.337 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   57.831 ns ±   9.455 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  54.6 ns      Histogram: log(frequency) by time      66.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
However, I struggle to use it programmatically. The performance guide in the documentation refers to this as well, and it's likely because I end up abusing the Val type. Here are two functions that iterate over these types.
function mapexp_mtm(data::MixedTypeMVector{T,N,B}) where {T,N,B}
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        val = data[Val(ix)]
        out[ix] = exp(val)
    end
    out
end
function mapexp_anyvec(data::Vector{Any})
    N = length(data)
    out = Vector{Float32}(undef, N)
    for ix in 1:N
        out[ix] = exp(data[ix])
    end
    out
end
Now the Vector{Any} solution is faster again.
@benchmark mapexp_mtm(mtmvec)
BenchmarkTools.Trial: 10000 samples with 310 evaluations.
 Range (min … max):  284.084 ns …   6.556 μs  ┊ GC (min … max): 0.00% … 92.83%
 Time  (median):     304.932 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   322.273 ns ± 225.994 ns  ┊ GC (mean ± σ):  2.89% ±  4.02%

  284 ns          Histogram: frequency by time          421 ns <

 Memory estimate: 288 bytes, allocs estimate: 8.

@benchmark mapexp_anyvec(anyvec)
BenchmarkTools.Trial: 10000 samples with 969 evaluations.
 Range (min … max):  79.211 ns …  2.279 μs  ┊ GC (min … max): 0.00% … 93.90%
 Time  (median):     85.538 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.402 ns ± 62.950 ns  ┊ GC (mean ± σ):  3.26% ±  4.71%

  79.2 ns         Histogram: frequency by time          139 ns <
This is because of the type instabilities that I create, as the @code_warntype output for mapexp_mtm shows (note val::Any and Val(ix)::Val):
MethodInstance for mapexp_mtm(::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13})
  from mapexp_mtm(data::MixedTypeMVector{T, N, B}) where {T, N, B} @ Main ~/projects/...
Static Parameters
  T = Tuple{Int64, Float32, UInt8}
  N = 3
  B = 13
Arguments
  #self#::Core.Const(mapexp_mtm)
  data::MixedTypeMVector{Tuple{Int64, Float32, UInt8}, 3, 13}
Locals
  @_3::Union{Nothing, Tuple{Int64, Int64}}
  out::Vector{Float32}
  ix::Int64
  val::Any
Body::Vector{Float32}
1 ─ %1  = Core.apply_type(Main.Vector, Main.Float32)::Core.Const(Vector{Float32})
│   %2  = Main.undef::Core.Const(UndefInitializer())
│         (out = (%1)(%2, $(Expr(:static_parameter, 2))))
│   %4  = (1:$(Expr(:static_parameter, 2)))::Core.Const(1:3)
│         (@_3 = Base.iterate(%4))
│   %6  = (@_3::Core.Const((1, 1)) === nothing)::Core.Const(false)
│   %7  = Base.not_int(%6)::Core.Const(true)
└──       goto #4 if not %7
2 ┄ %9  = @_3::Tuple{Int64, Int64}
│         (ix = Core.getfield(%9, 1))
│   %11 = Core.getfield(%9, 2)::Int64
│   %12 = Main.Val(ix)::Val
│         (val = Base.getindex(data, %12))
│   %14 = Main.exp(val)::Any
│         Base.setindex!(out, %14, ix)
│         (@_3 = Base.iterate(%4, %11))
│   %17 = (@_3 === nothing)::Bool
│   %18 = Base.not_int(%17)::Bool
└──       goto #4 if not %18
3 ─       goto #2
4 ┄       return out
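The Val(ix)::Val line is where inference gives up, because ix is only known at run time. A sketch of one possible direction is to let ntuple with Val(N) expand the loop at compile time, so that every index is a literal. ValIndexed below is a hypothetical stand-in backed by a plain tuple so the snippet runs without StaticArrays, and mapexp_unrolled is an illustrative name. Whether inference fully resolves this for MixedTypeMVector would still need checking with @code_warntype.

```julia
# Hypothetical stand-in with the same Val-based indexing as MixedTypeMVector,
# backed by a plain tuple so the example runs without StaticArrays.
struct ValIndexed{T<:Tuple}
    values::T
end

Base.getindex(v::ValIndexed, ::Val{IX}) where {IX} = v.values[IX]

# ntuple(f, Val(N)) expands to (f(1), f(2), ..., f(N)) for small N,
# so each call to the closure sees a literal index instead of a runtime ix.
function mapexp_unrolled(data, ::Val{N}) where {N}
    vals = ntuple(i -> Float32(exp(data[Val(i)])), Val(N))
    collect(vals)  # NTuple{N,Float32} -> Vector{Float32}
end

data = ValIndexed((1, 0.2f0, UInt8(3)))
out = mapexp_unrolled(data, Val(3))
println(out)
```

With the real struct, the call would presumably be mapexp_unrolled(mtmvec, Val(N)) with N taken from the type parameter.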
My questions are twofold:
- Suggestions for approaches on how to make the use of this struct more statically typed
- Completely new ways to solve the problems that I outlined at the beginning, things I have overlooked in general
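For the second question, one direction that might be worth comparing against is a function barrier over a plain tuple. This is only a sketch under assumptions: raw, hot_loop and the exp/sum workload are placeholders, and it has not been benchmarked against the struct above. The idea is that converting to a tuple is a single dynamic step, after which the inner loop is compiled once per distinct combination of types (not once per set of values) and needs no @eval.

```julia
# Hypothetical input: one set of values whose types are only known at run time.
raw = Any[1, 0.2f0, UInt8(3)]

# One-time dynamic step: the result has a concrete tuple type,
# and its isbits fields are stored inline next to each other.
vals = Tuple(raw)

# Function barrier: hot_loop is specialized on the concrete tuple type,
# so the iterations inside run without dynamic dispatch.
function hot_loop(vals::Tuple, iters::Int)
    acc = 0.0f0
    for _ in 1:iters
        acc += sum(map(x -> Float32(exp(x)), vals))
    end
    acc
end

result = hot_loop(vals, 10_000)
println(typeof(vals), " ", result)
```

Tuples are immutable, so writing a value means building a new tuple (for example with Base.setindex(vals, v, i)), which may or may not suit the sequential read/write pattern described at the top.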
Thanks for reading!