If you statically know the size, that’s great and you should obviously use static sizes. But sometimes you can’t statically know the size, but you know it’s usually small. That’s when something like a SmallVector is useful.
I was actually playing with something like this a little while ago after @Elrod talked about how much he likes LLVM’s SmallVector. Here’s a little demo:
using StrideArrays
mutable struct SmallStore{SpillSize}
inline::NTuple{SpillSize, UInt8}
outline::Union{Vector{UInt8}, Nothing}
end;
function smallarray(::Type{T}, ::StaticInt{SpillSize}, size::Integer...) where {T, SpillSize}
@assert isbitstype(T)
N = prod(size) * sizeof(T)
inline = Ref{NTuple{SpillSize, UInt8}}()[]
if N <= SpillSize
outline = nothing
store = SmallStore{SpillSize}(inline, outline)
ptr = pointer_from_objref(store)
else
outline = Vector{UInt8}(undef, N)
store = SmallStore{SpillSize}(inline, outline)
ptr = pointer(outline)
end
ptrA = PtrArray(Ptr{T}(ptr), size)
strA = StrideArray(ptrA, store)
end;
Now here’s a benchmark function comparing this smallarray
to an array without a static lower size bound:
function foo_sa()
SpillSize = static(10*8)
if rand() < 0.95
# 95% of the time our vector fits inline
L = 10
else
# 5% of the time it'll be too big
L = 10000
end
A = smallarray(Float64, SpillSize, L)
A .= (1:L)
sum(A)
end;
function foo()
if rand() < 0.95
# 95% of the time our vector fits inline
L = 10
else
# 5% of the time it'll be too big
L = 10000
end
A = StrideArray{Float64}(undef, L)
A .= (1:L)
sum(A)
end;
julia> @benchmark foo_sa()
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
Range (min … max): 138.866 ns … 3.697 μs ┊ GC (min … max): 0.00% … 78.22%
Time (median): 273.836 ns ┊ GC (median): 0.00%
Time (mean ± σ): 321.531 ns ± 176.710 ns ┊ GC (mean ± σ): 10.40% ± 15.50%
▃▆██▇▄▂
▁▁▂▃▅█████████▇▅▄▃▃▂▂▂▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
139 ns Histogram: frequency by time 932 ns <
Memory estimate: 2.05 KiB, allocs estimate: 1.
julia> @benchmark foo()
BenchmarkTools.Trial: 9078 samples with 994 evaluations.
Range (min … max): 269.175 ns … 2.844 μs ┊ GC (min … max): 0.00% … 80.24%
Time (median): 475.920 ns ┊ GC (median): 0.00%
Time (mean ± σ): 551.678 ns ± 220.005 ns ┊ GC (mean ± σ): 11.35% ± 15.79%
▁▆█▇▅
▁▁▁▁▃▅██████▆▅▃▂▂▂▂▂▂▂▃▃▄▃▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
269 ns Histogram: frequency by time 1.53 μs <
Memory estimate: 4.77 KiB, allocs estimate: 1.