I have a custom missing type which stores a symbol and a reference to an object whose type is not predetermined, so it’s an Any field. I need to store vectors of these, some values will be missing but most won’t be. Here’s a test function that looks at the allocations that happen when writing values and missings into such a vector.
The peculiar thing to me is that when I write ten missings unioned with Int, I get around 10 allocations, so one per missing. But if I union with Float64, every value in the vector needs an allocation. Why is the storage behavior different if both Int and Float64 are primitives with 8 bytes size?
struct MyMissing
location::Symbol
reason
end
function missing_vector(T, n)
v = Vector{Union{MyMissing,T}}(undef, n)
for i in 1:n
v[i] = i > 10 ? T(i) : MyMissing(:missing_vector, "under ten")
end
return count(x -> x isa MyMissing, v)
end
Oh yeah I heard somewhere that Julia has a cache of small Ints pre-boxed as an optimisation. So in your code no allocation needs to happen for the Ints.
However if you replace the code with T(100000+i) or make the loop bigger the difference should go away.
Yeah, indeed the difference goes away when bigger integers are used:
julia> function missing_vector(T, n)
v = Vector{Union{MyMissing,T}}(undef, n)
for i in 10000000:n+10000000-1
v[i - 10000000 + 1] = i > 10 ? T(i) : MyMissing(:missing_vector, "under ten")
end
end
missing_vector (generic function with 1 method)
julia> @time missing_vector(Int, 100)
0.000003 seconds (101 allocations: 2.438 KiB)
Oh interesting, pre-boxed integers.. Well then I picked a really unfortunate example when I checked early on if my union approach was ok in terms of allocation behavior. I thought that julia could store unions inline with type tags but maybe only if they are all isbits.
But overall it’s still very fast even allocating millions of these small values. So practically it probably doesn’t matter in my case.
As always, profile realistic uses of your code to identify the actual bottlenecks before trying to optimise anything.
Vector does have a more efficient storage for unions than boxing, but only for really small unions like Union{Nothing,Int}. I think it internally stores just the Ints inline plus has a BitVector to store which ones are actually Nothing.
That’s only for small unions where every subtype is isbits. It would be really nice to have it for pointer-ful (i.e. GC-tracked) types also. I believe that would enable implementing something like smallvec.
A related limitation is that a function that returns a union with a pointer-ful type can’t store its return values on the stack, but must heap-allocate it. This can be a problem if you lean heavily into error return types such that your code often returns unions.
I’m not sure whether these two limitations (Memory storage versus ABI for pointerful unions) are the same.