Getting union/tuple covariance for maps between parallel type hierarchies

Sorry, this title isn’t great but not sure what to put there :roll_eyes:

I am working on the parallel type hierarchy that’s part of ScientificTypes.jl and we have a function Scitype that maps types to types. I want the following covariance properties:

  1. the Scitype of a Tuple type should be the Tuple of the Scitypes
  2. the Scitype of a (finite) Union should be the Union of the Scitypes

My naive way of implementing 1. works as expected:

Scitype(::Type{Tuple{A,B}}) where {A,B} = Tuple{Scitype(A),Scitype(B)}

To see this, suppose we have

abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

Then

julia> Scitype(Tuple{Int,Float64})
Tuple{Count, Continuous}

However, mimicking this for union types throws a curious error:

Scitype(::Type{Union{A,B}}) where {A,B} = Union{Scitype(A),Scitype(B)}

julia> Scitype(Union{Int,Float64})
ERROR: UndefVarError: B not defined
Stacktrace:
 [1] Scitype(#unused#::Type{Union{Float64, Int64}})
   @ Main ./REPL[13]:1
 [2] top-level scope
   @ REPL[14]:1

I realize tuples and unions are different, and am reluctant to call this a bug. But I wonder how I should implement what I want here. The following works

Scitype(u::Union) = Union{Scitype(u.a),Scitype(u.b)}

but that looks like using non-public interface.

cc @samuel_okon


First obvious test:

abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

Scitype(::Type{Tuple{A,B}}) where {A,B} = Tuple{Scitype(A),Scitype(B)}

println(Scitype(Tuple{Int,Float64}))

# Scitype(::Type{Union{A,B}}) where {A,B} = Union{Scitype(A),Scitype(B)}
Scitype(::Type{Union{Int,Float64}}) = Union{Count,Continuous}

println(Scitype(Union{Int,Float64}))

prints

Tuple{Count, Continuous}
Union{Continuous, Count}

Generalizing:

abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

Scitype(::Type{Tuple{A,B}}) where {A,B} = Tuple{Scitype(A),Scitype(B)}

println(Scitype(Tuple{Int,Float64}))

# Scitype(::Type{Union{A,B}}) where {A,B} = Union{Scitype(A),Scitype(B)}
# Scitype(::Type{Union{Int,Float64}}) = Union{Count,Continuous}
Scitype(::Type{Union{A,B}}) where {A<:Integer,B<:AbstractFloat} = Union{Count,Continuous}

println(Scitype(Union{Int,Float64}))

still prints

Tuple{Count, Continuous}
Union{Continuous, Count}

An aside:

const Nested = Union{Nothing, Union{Bool, Int}}
println(Nested)

prints

Union{Nothing, Bool, Int64}

But unfortunately this

abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

Scitype(::Type{Tuple{A,B}}) where {A,B} = Tuple{Scitype(A),Scitype(B)}

println(Scitype(Tuple{Int,Float64}))

# Scitype(::Type{Union{A,B}}) where {A,B} = Union{Scitype(A),Scitype(B)}
# Scitype(::Type{Union{Int,Float64}}) = Union{Count,Continuous}
# Scitype(::Type{Union{A,B}}) where {A<:Integer,B<:AbstractFloat} = Union{Count,Continuous}
Scitype(::Type{Union{A,B}}) where {A<:Integer,B} = Union{Count,Scitype(B)}

println(Scitype(Union{Int,Float64}))

results in

Tuple{Count, Continuous}
ERROR: LoadError: StackOverflowError:

This looks like undefined-behaviour land to me. Tested on Julia 1.6.3.


Yes, this is quite weird:

julia> foo(::Type{Union{A, B}}) where {A, B} = A;

julia> foo(Union{Int, Float64})
Union{Float64, Int64}

julia> bar(::Type{Union{A, B}}) where {A, B} = B;

julia> bar(Union{Int, Float64})
ERROR: UndefVarError: B not defined
Stacktrace:
 [1] bar(#unused#::Type{Union{Float64, Int64}})
   @ Main ./REPL[3]:1
 [2] top-level scope
   @ REPL[4]:1

Hi Cameron, nice idea, I didn’t test left vs. right associativity;)

abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

Scitype(::Type{Tuple{A,B}}) where {A,B} = Tuple{Scitype(A),Scitype(B)}

println(Scitype(Tuple{Int,Float64}))

# Scitype(::Type{Union{A,B}}) where {A,B} = Union{Scitype(A),Scitype(B)}
# Scitype(::Type{Union{Int,Float64}}) = Union{Count,Continuous}
# Scitype(::Type{Union{A,B}}) where {A<:Integer,B<:AbstractFloat} = Union{Count,Continuous}
# Scitype(::Type{Union{A,B}}) where {A<:Integer,B} = Union{Count,Scitype(B)}
Scitype(::Type{Union{A,B}}) where {A,B<:AbstractFloat} = Union{Scitype(A),Continuous}

println(Scitype(Union{Int,Float64}))

prints

Tuple{Count, Continuous}
Union{Continuous, Count}

again :slight_smile:

@woclass: another issue? Filed a defect: https://github.com/JuliaLang/julia/issues/42710

Edit: the feedback on the issue says it works fine on master, so a backport may be required. @ablaom: should we check what happens on master?

I think this has something to do with the behavior we’re seeing:

julia> ( Union{A, B} where {A, B} ) == Any
true

So, essentially, this:

foo(::Type{Union{A, B}}) where {A, B} = # ...

is equivalent to this:

foo(::Type{A}) where {A} = # ...

In other words, the Union{A, B} gets simplified down to just A.

So what you are saying is

Scitype(::Type{Union{A,B}}) where {A,B<:AbstractFloat} = Union{Scitype(A),Continuous}

is supported and

Scitype(::Type{Union{A,B}}) where {A<:Integer,B} = Union{Count,Scitype(B)}

is not? That’s for the gods to decide.

It gets even weirder. This works:

julia> foo(::Type{Union{A, B}}) where {A <: Integer, B} = A, B
foo (generic function with 1 method)

julia> foo(Union{String, Int})
(Int64, String)

But if we add one more parameter to the Union, it doesn’t work:

julia> bar(::Type{Union{A, B, C}}) where {A <: Integer, B, C} = A, B, C
bar (generic function with 1 method)

julia> bar(Union{String, Int, Float64})
ERROR: UndefVarError: C not defined
Stacktrace:
 [1] bar(#unused#::Type{Union{Float64, Int64, String}})
   @ Main ./REPL[4]:1
 [2] top-level scope
   @ REPL[5]:1

Note that Union{Int,Float64} === Union{Float64,Int}: the order of the type arguments of a Union type is not fixed. That’s why neither of the following works; both signatures are ambiguous:

foo(::Type{Union{A, B}}) where {A, B} = A, B
bar(::Type{Union{A, B, C}}) where {A <: Integer, B, C} = A, B, C

Granted, the error message could be better.
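
The order-independence noted above can be checked directly. A minimal sketch using only built-in types; it also shows that nested and redundant members are normalized away when a Union is constructed:

```julia
# Union members have no fixed order, so swapping them yields the identical type.
@assert Union{Int, Float64} === Union{Float64, Int}

# Nested unions are flattened on construction.
@assert Union{Nothing, Union{Bool, Int}} === Union{Nothing, Bool, Int}

# Duplicate or subsumed members are dropped.
@assert Union{Int, Int} === Int
@assert Union{Int, Integer} === Integer
```

Since no member is "first", a signature such as Type{Union{A, B}} presumably gives dispatch nothing to go on when deciding which member binds to A and which to B.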

Furthermore, this works:

julia> foo(::Type{Union{A, B}}) where {A <: AbstractFloat, B} = A, B
foo (generic function with 1 method)

julia> foo(Union{Int, Float64})
(Float64, Int64)

While this does not:

julia> foo2(::Type{Union{A, B}}) where {A, B <: AbstractFloat} = A, B
foo2 (generic function with 1 method)

julia> foo2(Union{Int, Float64})
ERROR: UndefVarError: B not defined

So the order in which the type restrictions apply matters. This indeed seems to be a bug?


There are some dark corners of the type system that are not well documented. I don’t know if the behavior in these corner cases is defined but not documented, or just not defined at all…

@CameronBieganek @goerch @fingolfin Thank you indeed for these explorations and clarifications. Although I’m still not sure how to impose the covariance without accessing internals, you have corrected some of my wrong naive expectations and discovered some fascinating “dark corners” of the type system.

I note that even my implementation for tuples doesn’t work in general, because it only applies to 2-tuples. I can only think of the following hack to get what I (think I) want:

Scitype(t::Type{<:Tuple}) = Tuple{Scitype.(t.parameters)...}

Perhaps even the question of imposing “tuple / union covariance” on “maps between types” needs a more careful formulation before it can be answered :wink:

For the tuple case, you can use fieldtypes:

Scitype(t::Type{<:Tuple}) = Tuple{Scitype.(fieldtypes(t))...}
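
A self-contained sketch of the fieldtypes version, reusing the Continuous and Count definitions from earlier in the thread and trying it on tuples of other lengths. It assumes tuple types of fixed length; Vararg tuple types are not covered here.

```julia
abstract type Continuous end
abstract type Count end

Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous

# fieldtypes returns the element types as a tuple; Scitype is broadcast over
# them and the results are splatted back into a Tuple type.
Scitype(t::Type{<:Tuple}) = Tuple{Scitype.(fieldtypes(t))...}

println(Scitype(Tuple{Int, Float64}))              # 2-tuple, as before
println(Scitype(Tuple{Int, Float64, Int8}))        # 3-tuple
println(Scitype(Tuple{Tuple{Int, Float64}, Int}))  # nested tuples recurse
```

Because the method recurses through Scitype itself, it does not need to know which element methods exist; they only have to be defined by the time it is called.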

Looks like we should use these functions from Base: https://github.com/JuliaLang/julia/blob/49e3aecd5966a2af0b064c0314cd61c1338abc00/base/promotion.jl, for example

julia> Base.typesplit(Union{Float64, Int}, Int)
Float64

This could help to solve the puzzle. And the best overview I found of the subtyping relation is https://github.com/JuliaLang/julia/blob/2388a5b4001dcd7b78becd7402420a23c3c81e91/test/subtype.jl


Yeah, I thought about using Base.typesplit, but you have to know one of the types in order to split the other type off, which doesn’t help us in the Union{A, B} where {A, B} case. :slight_smile:
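
For reference, a short sketch of how Base.typesplit behaves, which illustrates the limitation: the second argument always has to be supplied by the caller. Base.typesplit is not exported, so treating it as stable API is an assumption.

```julia
# typesplit(a, b) removes from `a` every union member that is a subtype of `b`.
@assert Base.typesplit(Union{Float64, Int}, Int) === Float64

# If all of `a` is a subtype of `b`, nothing is left over.
@assert Base.typesplit(Int, Integer) === Union{}

# If nothing in `a` is a subtype of `b`, `a` comes back unchanged.
@assert Base.typesplit(Float64, Integer) === Float64

# `b` may be abstract, but it still has to be known in advance.
@assert Base.typesplit(Union{Int, Float64, String}, Real) === String
```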

This seems to work at first glance

abstract type Count end
abstract type Continuous end
abstract type Textual end

function Scitype(x::Type{Union{A}}) where A
    result = Union{}
    if Base.typesplit(A, Integer) != A
        result = Union{result, Count}
    end
    if Base.typesplit(A, AbstractFloat) != A
        result = Union{result, Continuous}
    end
    if Base.typesplit(A, AbstractString) != A
        result = Union{result, Textual}
    end
    result
end

println(Scitype(Int))
println(Scitype(Float64))
println(Scitype(String))
println(Scitype(Union{Int,Float64}))
println(Scitype(Union{Int,Float64,String}))

with output

Count
Continuous
Textual
Union{Continuous, Count}
Union{Continuous, Count, Textual}

Nice!

You can make one simplification to the function signature, since (Union{A} where A) == Any:

function Scitype(x::Type{A}) where A
    result = Union{}
    if Base.typesplit(A, Integer) != A
        result = Union{result, Count}
    end
    if Base.typesplit(A, AbstractFloat) != A
        result = Union{result, Continuous}
    end
    if Base.typesplit(A, AbstractString) != A
        result = Union{result, Textual}
    end
    result
end

Hey, this is great, but it’s not exactly what I had in mind. The idea is that I want to enforce the covariance independently of the other definitions, as these are added later (in my use case, in different packages).
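
One possible way to get covariance that does not depend on which leaf methods exist is to decompose the union and recurse. A minimal sketch, assuming Base.uniontypes, an unexported Base helper that lists the members of a Union (so relying on it is still a step outside the documented public interface). The leaf methods below stand in for definitions that other packages would add later, and Textual is borrowed from the typesplit example above.

```julia
abstract type Continuous end
abstract type Count end
abstract type Textual end

# Covariance rule, written without reference to any particular leaf method.
# Assumption: Base.uniontypes is available; it is not exported from Base.
Scitype(u::Union) = Union{map(Scitype, Base.uniontypes(u))...}

# Leaf definitions, added afterwards (possibly in other packages).
Scitype(::Type{<:Integer}) = Count
Scitype(::Type{<:AbstractFloat}) = Continuous
Scitype(::Type{<:AbstractString}) = Textual

println(Scitype(Union{Int, Float64}))
println(Scitype(Union{Int, Float64, String}))
```

A union whose members all fall under one existing method, such as Union{Int8, Int16}, also matches Scitype(::Type{<:Integer}); how dispatch resolves that overlap is not examined here.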