Adding a type parameter annotation symbol that preserves the original bound

Given a defined composite type with multiple constrained type parameters, e.g.:

julia> struct myT{T1<:Real, T2<:Integer} end

When we reference its type signature elsewhere (e.g., for a method definition), it’s common to specify only one (or part) of the parameters. When the one(s) to be specified (constrained) is not the first, we often resort to <:Any as a placeholder for the unspecified parameter(s):

julia> foo(arg::myT{<:Any, Int}) = (arg,)
foo (generic function with 1 method)

However, in this case, myT{<:Any, Int} ends up being a UnionAll that is no longer a subtype of the original type myT which is practically sufficient to bound all the concrete type instances of myT.

julia> (myT{<:Any, Int}) == (myT{T1, Int} where {T1})
true

julia> myT{<:Any, Int} <: myT
false

IMHO, not only is it conceptually jarring because UnonAll is supposed to represent all possible (hence legal) types tied to the original parametric type myT, but it may create confusion for the user of the parametric type who did not define myT in the first place. Even if they are the owner (i.e., they defined myT), they would have to code the full type signature everywhere else manually:

julia> foo2(arg::myT{<:Real, Int}) = (arg,)

I think a potential remedy for this “issue” is to provide a separate syntax :<: (or any other preferred symbol) such that:

myT{:<:, :<:} == myT{<:Real, <:Integer} == myT

This solution is similar to an open issue on GitHub, where a syntax sugar is proposed to conveniently ignore some type parameters. The subtle difference is that the original bound of the parameter is always preserved by :<: in this case.

2 Likes

Here’s a more relevant issue:

4 Likes

I find it confusing too, but I can’t really say there is a reason to impose the original type’s constraints on a UnionAll. myT{<:String, Int} accurately describes the set of all myT where the first parameter subtypes String and the second parameter is Int. It just so happens that the set has no usable types. Similarly, the only usable types in myT{<:Any, Int} are myT{<:Real, Int}.

I assume the more useful applications are annotations like myT{:<:, Int} or myT{S, Int} ... where {S:<:}. The bigger problem here is that there isn’t one original bound in the general case. Say we introduce a supertype

abstract superT{T1<:Union{Integer, Rational}, T2<:Integer} end
struct myT{T1<:Real, T2<:Integer} <: superT end

Now myT’s bounds aren’t enough to narrow down to usable types, we need to check superT as well. Generally, types check the parameter bounds of the entire supertype hierarchy up to Any, and there isn’t always a neat type intersection to compute in advance (typeintersect even warns you).

Of course, the sane practice is to manually maintain or narrow down parameters for subtypes (interactively, dump(::Type) helps) so we can look at our definitions without checking the whole type hierarchy. Assuming this, we could simplify :<: to adopting the immediate type definition’s bounds with no worries of running into unusable types.

I would argue that people should know and write down the exact type annotation in each method definition if they want that clarity, especially when method definitions stray from the parent modules. Making :<: automatically adopt parameter bounds requires an active session and reflection, it can’t help for reading source code. :<:, _ or other easy syntax forces me to take the extra step of looking up at least one type definition, just like <:Any or omitting trailing parameters would.

2 Likes

I find it confusing, too, but I can’t really say there is a reason to impose the original type’s constraints on a UnionAll

I would argue that this is only accurate when the upper bounds of the type parameters are all abstract types. In reality, we can define types like

struct myT2{T1<:Union{Int, Float64}, T2<:Union{Float64, ComplexF64}}
    a::T1
    b::T2
end

that form a finite set of type instances. In this case, myT2{:<:, Float64} dictates all usable subtypes. This is unlike the example you provided:

abstract superT{T1<:Union{Integer, Rational}, T2<:Integer} end
struct myT{T1<:Real, T2<:Integer} <: superT end

Now myT’s bounds aren’t enough to narrow down to usable types, we need to check superT as well. Generally, types check the parameter bounds of the entire supertype hierarchy up to Any, and there isn’t always a neat type intersection to compute in advance (typeintersect even warns you).

Moreover, when parametric types are used as type parameters for other parametric types, the type signature with partial constraints becomes much more verbose. This usually happens when the outer parametric is a container-like struct where you want to have a finite “union” of allowed element types:

julia> const VecMyT2{T<:myT2} = Vector{T}
Vector{T} where T<:myT2 (alias for Array{T, 1} where T<:myT2)

julia> VecMyT2{myT2{<:Any, Float64}}
ERROR: TypeError: in Array, in T, expected T<:myT2, got Type{myT2{<:Any, Float64}}

julia> VecMyT2{myT2{<:Union{Int, Float64}, Float64}}
Vector{myT2{<:Union{Float64, Int64}, Float64}} (alias for Array{myT2{<:Union{Float64, Int64}, Float64}, 1})

Lastly, for your final comment:

I would argue that people should know and write down the exact type annotation in each method definition if they want that clarity, especially when method definitions stray from the parent modules.

I agree that people should know/express precisely how the type parameters are constrained when they intend to. In an ideal world, I would suggest throwing an immediate AssertionError during the compile time if someone writes down a myT{<:String, Int} or myT{<:Any, Int} because they either are or contain invalid (unusable) type instances. That’s also why I brought up the (IMO) annoying instances of UnionAll.

However, enforcing this would not only introduce breaking changes but also make the code more verbose with the more nested parametric types you define in your code. That’s why I suggest introducing :<:, which functions as parameter omission with the loosest valid constraint to differentiate from {<:Any}.

As for inspecting what constraint exactly does :<: fall back to, I think we can always use help?> to easily inspect the type definition:

search: myT2

  No documentation found for private symbol.

  Summary
  ≡≡≡≡≡≡≡

  struct myT2{T1<:Union{Float64, Int64}, T2<:Union{Float64, ComplexF64}}

  Fields
  ≡≡≡≡≡≡

  a :: T1<:Union{Float64, Int64}
  b :: T2<:Union{Float64, ComplexF64}

If myT2 subtypes abstract type superT2{T1<:Float64, T2<:Float64}, then you run into the same problem as my earlier example. It seems silly to even have parameters there, but subsets of larger Unions wouldn’t seem so silly. Finite sets aren’t a sufficient condition for :<: narrowing down to only valid types, it’s entirely down to whether subtypes maintain or narrow type parameter bounds.

I wish, and honestly I don’t know why this wasn’t decided early on; I currently assume there was some niche or internal reason. If type definitions threw a TypeError when you don’t maintain or narrow down the type parameters in subtypes, thus guaranteeing what now is only best practice, throwing TypeErrors at invalid types become easier, and throwing TypeErrors (instead of AssertionError) at overly wide UnionAlls become just as easy AND guaranteed.

If myT2 subtypes abstract type superT2{T1<:Float64, T2<:Float64}, then you run into the same problem as my earlier example.

What do you mean by “the same problem”? Do you mean now myT2 has the same issue as

Now myT ’s bounds aren’t enough to narrow down to usable types, we need to check superT as well.

First, I want to double-check how you actually relate two types. You actually meant

struct myT{T1<:Real, T2<:Integer} <: superT{T1, T2} end

instead of

struct myT{T1<:Real, T2<:Integer} <: superT end

in the original case, right?

So, in the case of myT2, such a relation translates to

abstract type superT2_1{T1, T2} end

struct myT2_1{T1<:Union{Int, Float64}, T2<:Union{Float64, ComplexF64}} <:  superT2_1{T1, T2}
    a::T1
    b::T2
end

or

abstract type superT2_2{T1<:Float64, T2<:Float64} end

struct myT2_2{T1<:Union{Int, Float64}, T2<:Union{Float64, ComplexF64}} <:  superT2_2{T1, T2}
    a::T1
    b::T2
end

for whether the parametric supertype superT2 has narrower bounds for the shared parameters T1 and T2 with its parametric subtype myT2. This is actually a stronger condition than “having another supertype that subtypes Any.”

For the former case (myT2_1 and superT2_1), myT2_1 still solely determines all the possible concrete type instances:

julia> myT2_1(1, 1.0)
myT2_1{Int64, Float64}(1, 1.0)

julia> myT2_1(1.0, 1.0)
myT2_1{Float64, Float64}(1.0, 1.0)

julia> myT2_1(1, 1.0im)
myT2_1{Int64, ComplexF64}(1, 0.0 + 1.0im)

julia> myT2_1(1.0, 1.0im)
myT2_1{Float64, ComplexF64}(1.0, 0.0 + 1.0im)

For the latter case (myT2_2 and superT2_2), you indeed will have to check the bounds of T1 and T2 provided by superT2_2 as well (which I assume is what you intended to point out):

julia> myT2_2(1, 1.0)
ERROR: TypeError: in superT2_2, in T1, expected T1<:Float64, got Type{Int64}

Personally, I would argue that we should not be allowed to successfully define myT2_2 in the first place because its type signature

myT2_2{T1<:Union{Int, Float64}, T2<:Union{Float64, ComplexF64}}

does not enclose a type set (for the parameters T1 and T2) that’s strictly the subset of superT2_2’s type signature

superT2_2{T1<:Float64, T2<:Float64}

Ironically, although you cannot construct myT2_2(1, 1.0)::myT2_2{Int, Float64}, if you are not the owner of myT2_2 and had decided to inspect its parameter bounds using help?>, you would get “wrong” (or one could argue “unrealistic”) type information about myT2_2 and superT2_2:

help?> myT2_2
search: myT2_2 myT2_1

  No documentation found for private symbol.

  Summary
  ≡≡≡≡≡≡≡

  struct myT2_2{T1<:Union{Float64, Int64}, T2<:Union{Float64, ComplexF64}}

  Fields
  ≡≡≡≡≡≡

  a :: T1<:Union{Float64, Int64}
  b :: T2<:Union{Float64, ComplexF64}

  Supertype Hierarchy
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  myT2_2{T1<:Union{Float64, Int64}, T2<:Union{Float64, ComplexF64}} <: superT2_2{T1<:Union{Float64, Int64}, T2<:Union{Float64, ComplexF64}} <: Any

This misleading result indicates that, even in the internal development team of Julia, there might be a misconception/inconsistency in understanding the inheritance relation between subtypes and supertypes sharing the same type parameters but different parameter bounds…

This is only legal when superT was defined with no type parameters (is a DataType, not a UnionAll), so you are right about that. I think you generally have the right gist of my argument too.

It’s actually not misleading, that’s how subtyping UnionAlls (iterated unions of types) work. The types are a bit long so I’ll use type aliases for shorthands:

julia> begin
       # myT2_2's constraints are too wide
       superT2_2_excess = superT2_2{<:Union{Float64, Int64}, <:Union{Float64, ComplexF64}}

       # superT2_2's constraints are narrower
       myT2_2_subset = myT2_2{<:Float64, <:Float64}

       [myT2_2 <:        superT2_2   myT2_2 <:        superT2_2_excess
        myT2_2_subset <: superT2_2   myT2_2_subset <: superT2_2_excess]
       end
2×2 Matrix{Bool}:
 0  1
 1  1

Although every instance of myT2_2 will also be an instance of superT2_2, myT2_2 defines a UnionAll that contains invalid types and thus does not subtype superT2_2, but rather an excessive UnionAll based on (and supertype of) superT2_2. Again, not a problem if we follow best practices and narrow parameter bounds in subtype definitions (in this case, we define myT2_2 to be equivalent to myT2_2_subset instead).

Personally, I think since the proposals to use _ for currying aren’t going anywhere, I’d rather we use _ for this.

i.e. myT{_, Int} would be myT{(T where T<:Real), Int}.

I say it’s misleading not because it’s inconsistent with

but because it shows a falsely larger set of type instances of myT2_2 the user can construct. The user would not have been able to foresee that

myT2_2{Int, Float64}

is illegal just from help?> until they actually run a code like

julia> myT2_2(1, 1.0)
ERROR: TypeError: in superT2_2, in T1, expected T1<:Float64, got Type{Int64}

I’m not convinced syntax is worth it for this — either as :<: or _ — because the most common way I’ve run into this trouble is when capturing T as a method parameter.

Frankly, I just don’t assert bounds on parametric types because it doesn’t really gain you much anyhow; it’s just as easy to check it in a constructor if you need.

There’s no guarantee that every type described by an iterated union doesn’t throw a TypeError, that’s why there’s no such assertion. We can’t foresee it, but it’s not a falsehood, just a fact of the type system deviating from set theory for practical reasons. Reals subset the complex numbers, but Real does not subtype Complex because Complex is a composite of 2 Reals. The set of all pigs that fly is the empty set, but an iterated union containing only erroring types is not equal to Union{}. We just have to take care to not define them; something like :<: can help save some typing, but it can’t do anything about poor subtyping practices.

I’m no expert in compiler engineering, so please correct me if I’m wrong. Does enforcing a one-directional hierarchy of type parameter bound (such as a composite type like myT2_1 that encloses a finite set of type instances without checking its supertypes.) not help the compiler generate more efficient code in cases like more aggressive union splitting or isbits Union Optimization?

myT2_1 describes 4 types (2x2), so that might just go past union-splitting’s limit. I’ll check with a single type parameter, 2 possible types:

julia> struct A{T<:Union{Int64, Float64}} a::T end

julia> struct B b::A end # abstract annotation usually unstable

julia> geta(x::B) = x.b.a
geta (generic function with 1 method)

julia> @code_warntype geta(B(A(1)))
MethodInstance for geta(::B)
  from geta(x::B) @ Main REPL[14]:1
Arguments
  #self#::Core.Const(Main.geta)
  x::B
Body::Union{Float64, Int64}
1 ─ %1 = Base.getproperty(x, :b)::A
│   %2 = Base.getproperty(%1, :a)::Union{Float64, Int64}
└──      return %2

I had to do that little indirection in a field of another struct for the compiler to not specialize over the actual instance’s concrete type, and it does work (a small typo really misled me there). But abstract annotations of fields are intended for limiting compilation, not optimizing type inference or performance. Unless there is a semantic reason for it, I wouldn’t make such a tight constraint at definition, there are plenty of better ways to optimize struct A{T} a::T end.

I’m not saying a syntax sugar like :<: or _ will resolve this issue. However, I cannot agree with you that

 it’s not a falsehood

even though there is a well-understood cause for the non-guaranteed inference of all valid type instances beforehand.

The reality is that any user that only learns the type signature of a composite type through help?> at the current stage is prone to construction errors. Thus, help?> is misleading in this regard.

In the vast majority of common uses, the struct’s parameter is the concretely specified type — no union splitting needed. The bounds don’t give any extra information. Yes, it’s theoretically possible that the compiler might use this information to recover from a type instability in some situations, but the gains there are often minimal.

1 Like

Thanks for the explanation!

Are developers prone to making excessive constraints at definitions in reality? That’s what you need for help?> to suggest invalid types. The bigger risk seems to be reading method type annotations as you mentioned first, but where clauses don’t affect help mode showing the type definition.

julia> struct X{T<:Real} end

help?> X{<:Any}
  No documentation found for private symbol.

  Summary
  ≡≡≡≡≡≡≡

  struct X{T<:Real}

The point of bringing up excessive constraints wasn’t to declare reflection misleading and users error-prone, it was to point out that :<: adopting the type definition’s constraints cannot simply assume it bounds only valid types because the type definition itself may bound invalid types under bad practice. The possibility of potholes does not assert that the roads are littered with them.