Treating Union as multiple types

The following method ambiguity comes up quite often:

f(a::AbstractVecOrMat, b::Number) = 1
f(a::AbstractVector, b::Any) = 2

julia> f([1],1)
ERROR: MethodError: f(::Vector{Int64}, ::Int64) is ambiguous. Candidates:
  f(a::AbstractVecOrMat, b::Number) in Main at REPL[75]:1
  f(a::AbstractVector, b) in Main at REPL[76]:1
Possible fix, define
  f(::AbstractVector, ::Number)
Stacktrace:
 [1] top-level scope
   @ REPL[77]:1

I read in the Julia docs that:

The Julia compiler is able to generate efficient code in the presence of Union types with a small number of types [1], by generating specialized code in separate branches for each possible type.

Does that mean that, for example, if I have a method for AbstractVecOrMat, the compiler will generate two methods for AbstractVector and AbstractMatrix respectively? If that’s the case, then the above ambiguity should not happen, as the first method is meant to be seen as

f(a::AbstractVector, b::Number) = 1
f(a::AbstractMatrix, b::Number) = 1

So my question is: why does Julia treat a Union as a single type? Are there advantages to this over my understanding of how Union should work?

The method called is always the most specific one. The ambiguity arises because Number is more specific than Any, but AbstractVector is more specific than AbstractVecOrMat. Thus, if you pass a vector and a number, it is not clear whether you want the method that is specific because the first argument is a vector (method 2) or the one that is specific because the second argument is a number (method 1).
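To make the specificity argument concrete, here is a minimal sketch. The module name `AmbiguityDemo` and the function name `f_fixed` are made up for illustration; the module only keeps the definitions out of Main. It checks that neither signature is a subtype of the other, confirms that the ambiguous call throws a MethodError, and applies the fix suggested by the error message.

```julia
# Hypothetical module name; it only isolates these definitions from Main.
module AmbiguityDemo
f(a::AbstractVecOrMat, b::Number) = 1
f(a::AbstractVector, b::Any) = 2

# Same two methods plus the one suggested by the error message,
# under a separate (hypothetical) name so both variants can be compared.
f_fixed(a::AbstractVecOrMat, b::Number) = 1
f_fixed(a::AbstractVector, b::Any) = 2
f_fixed(a::AbstractVector, b::Number) = 1
end

# Neither signature is a subtype of the other, so neither is more specific.
sig1 = Tuple{AbstractVecOrMat, Number}
sig2 = Tuple{AbstractVector, Any}
println(sig1 <: sig2)   # false: a matrix matches sig1 but not sig2
println(sig2 <: sig1)   # false: a non-Number second argument matches sig2 but not sig1

# A call that both methods match raises a MethodError.
is_ambiguous = try
    AmbiguityDemo.f([1], 1)
    false
catch err
    err isa MethodError
end
println(is_ambiguous)   # true

# Calls that only one method matches are unaffected.
println(AmbiguityDemo.f([1 2; 3 4], 1))   # 1
println(AmbiguityDemo.f([1], "a"))        # 2

# With the extra method defined, the call dispatches without ambiguity.
println(AmbiguityDemo.f_fixed([1], 1))    # 1
```

The extra method is more specific than both originals in every argument, which is why it resolves the tie.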

No, defining a method on AbstractVecOrMat creates a single method:

julia> f(a::AbstractVecOrMat, b::Number) = 1
f (generic function with 1 method)

(1 method). If it did create two methods, the ambiguity would indeed not exist, as you noted:

julia> f(a::AbstractVector, b::Number) = 1
f (generic function with 1 method)

julia> f(a::AbstractMatrix, b::Number) = 1
f (generic function with 2 methods)

julia> f(a::AbstractVector, b::Any) = 2
f (generic function with 3 methods)

julia> f([1],1)
1

because the first two methods are unambiguously more specific than the third one.

(Julia will still compile a specialized version of the method for the specific input types; those specializations just don’t count as new methods for dispatch, at the syntax level.)
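A small sketch of that distinction, assuming a made-up module `SpecializationDemo` and function `h`: calling the method with different concrete types triggers compilation for each, but the method table still holds one entry.

```julia
# Hypothetical module and function names, used only for illustration.
module SpecializationDemo
h(a::AbstractVecOrMat, b::Number) = 1
end

n_before = length(methods(SpecializationDemo.h))

# Each call with new concrete argument types is compiled separately...
SpecializationDemo.h([1, 2], 1)         # Vector{Int}, Int
SpecializationDemo.h([1.0 2.0], 1.0)    # Matrix{Float64}, Float64

# ...but no new methods appear in the method table used for dispatch.
n_after = length(methods(SpecializationDemo.h))
println((n_before, n_after))   # (1, 1)
```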


Do you mean that at the compile level, Julia does split the method? Are there good reasons why Julia doesn’t interpret it that way at the syntax level? That would save a lot of disambiguation work.

I guess the reason is that these signatures are ambiguous.

Note that both f(a::AbstractVecOrMat, b::Number) and f(a::AbstractVector, b::Any) could give different machine code after compilation when specialized for f(::Vector{Int}, ::Int) in your example.


Yes, for the specific concrete types given.

Because that would mean something different syntactically. It is up to the programmer to decide which method a call should be dispatched to. Otherwise f(x::Any) would not be any less generic than any other method definition.


Can you elaborate on that? Is there an example where splitting a Union at the syntax level would cause problems for the programmer?

The situation of Any is different from that of Union. As far as I understand, all types except Unions form a tree with Any as the root. A Union is used when different subtrees share the same code, so when I use a Union, I do intend the method to treat those types separately, just with the same code.

Would it be a problem if Julia separated the code for Unions before compilation, the way @eval works? A Union definition is quite literal and easy to interpret.

In other words, using for-@eval to iterate over types does the same thing at the compilation level as collecting the types into a Union, except that @eval takes a few more lines while a Union can cause ambiguity problems. Why is Union not made a shortcut for the for-@eval combo?
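For reference, the two styles being compared might look like the sketch below. The module `EvalVsUnion` and the function names `u` and `v` are hypothetical. The Union form yields one method, while the for-@eval form yields one method per listed type, which is why only the latter avoids the ambiguity from the original example.

```julia
# Hypothetical module and function names, used only for illustration.
module EvalVsUnion
# Union form: a single method whose first argument accepts either type.
u(a::Union{AbstractVector, AbstractMatrix}, b::Number) = 1

# for-@eval form: one separate method is generated per type.
for T in (:AbstractVector, :AbstractMatrix)
    @eval v(a::$T, b::Number) = 1
end

# The second method from the original example, added to the split version.
v(a::AbstractVector, b::Any) = 2
end

println(length(methods(EvalVsUnion.u)))   # 1
println(length(methods(EvalVsUnion.v)))   # 3

# No ambiguity: v(::AbstractVector, ::Number) is the most specific match.
println(EvalVsUnion.v([1], 1))            # 1
```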

julia> f(x::Union{Int,Float64}) = 1
f (generic function with 1 method)

julia> f(x::Int) = 2
f (generic function with 2 methods)

julia> f(1)
2

As a programmer, I want Ints to be dispatched to the second method when it exists. A more realistic (and less verbose) version would have f(x::Real) = 1 as the first method.

Note, furthermore, that there is no limit to the number of subtypes of Real in this case, so splitting at the syntax level does not make sense. Only when a concrete type is provided does it make sense to decide which method to use, and then the method that is more specific at the syntax level is chosen. After that, specialized compiled code for that method and the given concrete type is generated.
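A short sketch of the Real variant, assuming a made-up module `RealDemo`, function `r`, and a user-defined type `MyReal`. It illustrates that new subtypes of Real can be added at any time, and that each call picks the most specific method for the concrete argument type.

```julia
# Hypothetical module, function, and type names, used only for illustration.
module RealDemo
r(x::Real) = 1   # generic fallback for every Real, including future subtypes
r(x::Int) = 2    # more specific method for Int

# A user-defined subtype of Real that did not exist when r was written.
struct MyReal <: Real end
end

println(RealDemo.r(1))                   # 2: the Int method is more specific
println(RealDemo.r(1.0))                 # 1: falls back to the Real method
println(RealDemo.r(1//2))                # 1: Rational is also a Real
println(RealDemo.r(RealDemo.MyReal()))   # 1: so is the new subtype
```

Because the set of subtypes is open-ended, there is no fixed list of methods that r(x::Real) could be expanded into ahead of time.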

Another realistic possibility is:

julia> g(x::Union{Int,Float64}) = "Not implemented: you need to implement g for $(typeof(x))"
g (generic function with 1 method)

julia> g(x::Int) = 1
g (generic function with 2 methods)

julia> g(1)
1

julia> g(2.0)
"Not implemented: you need to implement g for Float64"

Thanks for that example!

If the Union were split, meaning that the first line is g(Int);g(Float64), wouldn’t the second line overwrite the first g? Then the result would be identical. I can see that the situation is different when the two lines are swapped, but I can’t see that being a realistic problem.

If the two lines are in the same namespace, then it’s easy to move the second line after the first. If the first line is from a dependency, then the second line overwrites and works. If the second line is from a dependency, then the Int from the first line, since it’s downstream, could be deleted to adapt to the upstream update.

I think if you try to deal with all these issues, you’ll arrive at the current behavior.

What if the two lines are just at the REPL, one after the other, in any order?
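The order question can be sketched as follows. All four module names are hypothetical, and the "split" modules only imitate by hand what a split Union might expand to; Julia does not do this. With the actual Union semantics the result does not depend on definition order, whereas with hand-written split definitions the later definition for Int overwrites the earlier one.

```julia
# Actual Union semantics: definition order does not matter.
module UnionFirst
g(x::Union{Int, Float64}) = "not implemented"
g(x::Int) = 1
end

module UnionLast
g(x::Int) = 1
g(x::Union{Int, Float64}) = "not implemented"
end

# Hypothetical split semantics, written out by hand (an assumption about
# what splitting would mean, not something Julia does).
module SplitFirst
g(x::Int) = "not implemented"
g(x::Float64) = "not implemented"
g(x::Int) = 1                     # overwrites the earlier Int method
end

module SplitLast
g(x::Int) = 1
g(x::Int) = "not implemented"     # overwrites the specific Int method
g(x::Float64) = "not implemented"
end

println(UnionFirst.g(1))   # 1
println(UnionLast.g(1))    # 1
println(SplitFirst.g(1))   # 1
println(SplitLast.g(1))    # not implemented
```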
