Vector{Union{Missing, Int}} is efficiently represented, but Vector{Union{Int, Float64}} isn’t. Is there a predicate to distinguish between the two, similar to isconcretetype?
How are you determining what is or is not “efficiently represented”?
I was under the impression that the optimizations for isbits Unions apply to Vector{Union{Int, Float64}} too, since both Int and Float64 are bits types. Is this not the case?
It is the case. Missing and Nothing behave a little differently in those Unions if they are present as missing or nothing in a vector (see below) – but not if they are in the Union type and not present in the vector. That’s why I asked “How is this being determined?”
julia> Base.summarysize(Union{Missing,Int32}[1, 2])
56
julia> Base.summarysize(Union{Missing,Int32}[1, missing])
52
julia> Base.summarysize(Union{Nothing,Int32}[1, nothing])
52
TIL, thank you for pointing me to the docs. So I suppose it boils down to checking if all the unioned types are isbits?
Maybe the more straightforward question is: how do I know if Array{T} is represented internally as an array of pointers?
Do you mean whether it should be, or how to introspect what happens for a particular T in practice?
I think the current situation is that for the union of two (or more?) bits types, it is guaranteed to be represented efficiently, but this is an implementation detail and may be expanded later on.
Right, but for context, I’d like to fix this performance warning: Spurious warning about concrete types · Issue #287 · JuliaMath/Interpolations.jl · GitHub
Base.isbitsunion(T)
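A minimal sketch of how this predicate behaves on the element types mentioned in this thread. Base.isbitsunion is unexported, so treat it as an internal that may change; stored_inline_guess is a hypothetical helper name, and it is only a conservative approximation of "not an array of pointers", not a statement of what the runtime guarantees.

```julia
# Element types from the discussion, plus one union with a non-bits member.
U1 = Union{Missing, Int}
U2 = Union{Int, Float64}
U3 = Union{Int, String}

r1 = Base.isbitsunion(U1)   # every member is a bits type
r2 = Base.isbitsunion(U2)   # every member is a bits type
r3 = Base.isbitsunion(U3)   # String is not a bits type
r4 = Base.isbitsunion(Int)  # a plain bits type is not a Union

# Hypothetical helper (not part of Base): a conservative guess at whether
# Array{T} can hold its elements inline rather than as pointers.
stored_inline_guess(T::Type) = isbitstype(T) || Base.isbitsunion(T)

for T in (Int, U1, U2, U3, Any)
    println(T, " => ", stored_inline_guess(T))
end
```

Note that the predicate does not separate the two types from the original question: both Union{Missing, Int} and Union{Int, Float64} are bits unions.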
It also seems that unions are only optimized if they are over fewer than five types:
julia> f(a) = first(a)/2
f (generic function with 1 method)
julia> types = filter(isbitstype,subtypes(Signed))
5-element Array{Any,1}:
Int128
Int16
Int32
Int64
Int8
julia> using BenchmarkTools
julia> for i in eachindex(types)
           T = Union{types[1:i]...}
           @show T
           @show Base.isbitsunion(T)
           arr = T[one(first(types))]
           println(@btime f($arr))
       end
T = Int128
Base.isbitsunion(T) = false
7.103 ns (0 allocations: 0 bytes)
0.5
T = Union{Int128, Int16}
Base.isbitsunion(T) = true
9.050 ns (0 allocations: 0 bytes)
0.5
T = Union{Int128, Int16, Int32}
Base.isbitsunion(T) = true
9.349 ns (0 allocations: 0 bytes)
0.5
T = Union{Int128, Int16, Int32, Int64}
Base.isbitsunion(T) = true
8.787 ns (0 allocations: 0 bytes)
0.5
T = Union{Int128, Int16, Int32, Int64, Int8}
Base.isbitsunion(T) = true
33.701 ns (2 allocations: 48 bytes)
0.5
What is an efficient way to determine the number of constituents in a Union of concrete types? Or, better yet, get them as a tuple quickly.
Just found the answer in a post by TPapp: Base.uniontypes(x)
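For illustration, a small sketch of counting the constituents and collecting them as a tuple. Base.uniontypes is unexported, and the order of the returned members should be treated as an implementation detail; the conversion below allocates, so whether it is "quick" enough depends on the use case.

```julia
T = Union{Int8, Int16, Int32}

members = Base.uniontypes(T)     # Vector{Any} holding the component types
n = length(members)              # number of constituents
members_tuple = Tuple(members)   # the same components as a tuple

println(n)
println(members_tuple)
```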
How about
isfastunion(::Type{T}) where {U,T<:Union{U}} =
    Base.isbitsunion(T) && length(Base.uniontypes(T)) < 5
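A variant of the same idea with a plainer signature, as a sketch only. The name issmallbitsunion and the maxtypes keyword are hypothetical, and the default of 4 is taken from the benchmark in this thread; it is an observation about one Julia version, not a documented guarantee.

```julia
# Hypothetical helper: true when T is a bits union with at most `maxtypes` members.
# The default limit mirrors the benchmark above and is an assumption.
function issmallbitsunion(T::Type; maxtypes::Int = 4)
    return Base.isbitsunion(T) && length(Base.uniontypes(T)) <= maxtypes
end

two  = Union{Int8, Int16}
five = Union{Int8, Int16, Int32, Int64, Int128}

println(issmallbitsunion(two))                 # two-member bits union
println(issmallbitsunion(five))                # five members exceed the default limit
println(issmallbitsunion(five; maxtypes = 5))  # limit raised by the caller
println(issmallbitsunion(Int64))               # not a Union at all
```

Keeping the limit as a keyword argument makes it easy to adjust if the threshold turns out to differ between Julia versions.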
Why is that?
Because if not 5, then 4 or 6 or thereabouts. My guess: the count has to be less than the number of bits in a byte, for bitset speed without too much type-tag memory overhead (which would slow things down overall). As we do not yet know how best to use the remaining possibilities, it was deemed smart to hold a bit or two in reserve until a better use comes to the fore. There is more involved than altering a constant. TBD.