Why doesn't `vect` return a union-typed result if the number of elements is small?

E.g.:

julia> struct A end

julia> struct B end

julia> [A(), B()]
2-element Vector{Any}:
 A()
 B()

Would it not be better to return

julia> Union{A,B}[A(), B()]
2-element Vector{Union{A, B}}:
 A()
 B()

instead, as this may then take advantage of union splitting to improve performance?

For example:

julia> Base.:(+)(::A, ::B) = A()

julia> Base.:(+)(::B, ::A) = A()

julia> using BenchmarkTools

julia> @btime sum($([A(), B()]));
  18.967 ns (0 allocations: 0 bytes)

julia> @btime sum($(Union{A,B}[A(), B()]));
  6.709 ns (0 allocations: 0 bytes)

I’m not that deep into Julia’s internals, so this is just a naive answer: I think people would be confused if this no longer worked:

x = [1, "2"]
push!(x, '3')

Even more, the Union trick would probably only make sense if the number of different types is limited, so it would be case specific which type this constructor yields.


In this case, one should use

x = Any[1, "2"]
push!(x, '3')

It’s not obvious that [1, "2"] must be a Vector{Any}, although that turns out to be the case. For example, the following doesn’t work, because the array literal produces a Union as the eltype:

julia> x = [1, nothing];

julia> push!(x, missing)
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type Int64

julia> typeof(x)
Vector{Union{Nothing, Int64}} (alias for Array{Union{Nothing, Int64}, 1})

> Even more, the Union trick would probably only make sense if the number of different types is limited

Yes, I agree: a Union eltype might only be chosen when there are just a few arguments. This would be type-stable, albeit somewhat perplexing.

My question is: since this already is adopted in the special cases of nothing and missing, why not make this more universal, if only for singleton types? I had seen an issue recently where there was some discussion on nothing and missing being special, which is a bit unfortunate.
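To make the special-casing concrete, here is a small sketch of how the element type comes out for a few array literals. It only uses `eltype` and `promote_type` from Base; the choice of examples is mine.

```julia
# nothing and missing have promotion rules that yield a Union with the
# other type, so the array literal gets a Union eltype.
t_nothing = eltype([1, nothing])
t_missing = eltype([1, missing])

# Unrelated types have no such rule, so the eltype falls back to Any.
t_any = eltype([1, "2"])

# The same results are visible directly through promote_type.
p_nothing = promote_type(Int, Nothing)
p_any = promote_type(Int, String)

println(t_nothing)
println(t_missing)
println(t_any)
```

So the Union eltype for nothing and missing comes from their promotion rules rather than from anything specific to the array literal syntax.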


One challenge with doing what you’ve specifically asked is that I think it’s reasonable to want [1, 2.5] to promote both entries to Float64 rather than make a Vector{Union{Int,Float64}}. The distinction here is that promote(1,2.5) “succeeds” whereas promote(A(),B()) for your example does not.
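A quick sketch of that distinction, redefining the `A` and `B` structs from the first post so that it is self-contained:

```julia
struct A end
struct B end

# Numeric promotion succeeds: both values become Float64.
promoted = promote(1, 2.5)
numeric_type = promote_type(Int, Float64)

# A and B have no promotion rule, so the common type is just Any...
ab_type = promote_type(A, B)

# ...and promote itself throws, because it cannot change either argument.
promotion_failed = try
    promote(A(), B())
    false
catch
    true
end

println(promoted)
println(ab_type)
```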

So it seems that the preferred behavior would be to attempt to promote the elements first, then introduce a Union if multiple (or maybe only a few) distinct types remain.

In the meantime, perhaps this function definition will be useful to you?

unionvec(x...) = Union{typeof.(x)...}[x...]

It seems to do what you asked, although it doesn’t attempt the promotion that I suggested, nor does it fall back to Any beyond a certain number of distinct types.
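For what it's worth, here is a rough sketch of a variant that tries promotion first and falls back to Any past a threshold. The name `smallunionvec` and the `maxtypes` keyword are made up for illustration and nothing like this exists in Base. It also only uses the promoted type when promotion yields a single concrete type, so it won't partially promote a mix like `1, 2.5, A()`.

```julia
struct A end
struct B end

# Hypothetical helper (not in Base): promote first, then fall back to a small
# Union, and finally to Any once there are more than `maxtypes` distinct types.
function smallunionvec(xs...; maxtypes::Int = 3)
    T = promote_type(map(typeof, xs)...)
    # Promotion found a single concrete type (e.g. Float64): use it.
    isconcretetype(T) && return T[xs...]
    # Otherwise gather the distinct element types, in order of appearance.
    types = unique(map(typeof, xs))
    # Few distinct types: build a Union eltype; too many: give up and use Any.
    return length(types) <= maxtypes ? Union{types...}[xs...] : Any[xs...]
end

v1 = smallunionvec(1, 2.5)            # promotion wins
v2 = smallunionvec(A(), B())          # small Union
v3 = smallunionvec(1, "2", '3', A())  # four distinct types, above the default limit

println(typeof(v1))
println(typeof(v2))
println(typeof(v3))
```

I haven't checked how well this infers. Since the result type depends on the value of `maxtypes`, the compiler may not always be able to work out the eltype at the call site.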
