Why [1, 2, 3] is not a Vector{Number}?

I think one of the things that leads to confusion here (including my own), is that any parameterized struct with an abstract type parameter will be a concrete type. For example, if we have the following,

struct Foo{T} end

struct Bar{T}
    x::T
end

struct Point{T}
    x::T
    y::T
end

then Foo{Real}, Bar{Real}, and Point{Real}, along with Vector{Real} and Complex{Real}, are all concrete types. (I lost sight of this fact in my post above.) Since concrete types cannot be subtyped in Julia, it’s not possible to have Bar{Float64} be a subtype of Bar{Real}.

So there seems to be a deep connection between the following two language features:

  • All parametric types are invariant (except for tuples).
  • Concrete types cannot be subtyped.

I don’t think the connection between these two features is emphasized in the manual.

However, I’m not sure if the chain of logic goes like this:

  1. Parametric types must be invariant for performance.
  2. Wrapper{SomeAbstractType} is a concrete type.
  3. Therefore concrete types cannot be subtyped.

or like this:

  1. Concrete types cannot be subtyped for performance.
  2. Wrapper{SomeAbstractType} is a concrete type.
  3. Therefore parametric types must be invariant.
5 Likes

As I recall (I could check “the archives”, ie old emails, but that would take some excavation), but I think the motivation for concrete types being final and parametric invariance was more about soundness than performance. We had both those features long before we had a fast compiler. We were, however, well aware of the problems that these two issues cause in OOP languages for both performance and usability, so it was in our minds.

I’ve made the case before that it makes no sense for concrete types to be subtypeable. When you subtype something, you are effectively saying its definition was not complete and that you are completing it by sub typing it. If the type was incomplete then you should not have been able to instantiate it, so the supertype shouldn’t be concrete. Conversely, anything you can instantiate is complete — otherwise how can you have an instance of it — so it should not be possible to make it more complete by subtyping it. If you have a thing that can be instantiate and a thing that can be subtypes, even if they look and act similar, those are two separate concepts and should be two separate types: one that is final that you can instantiate and one that you can subtype, which the other can be a subtype of. This design is often recommended in OOP design posts.

There’s a point of view in OOP that goes something like this: even though the supertype is complete and instantiate, I need to modify it by subtyping it, to change some of its behavior and/or add some data to it. Julia’s design essentially takes the position that this specific kind of subtyping is bad and should simply not be allowed. The fact that methods can be added externally majorly alleviates the need to do this. For other cases, you can do the pattern where you separate the instantiable final type from the abstract inheritable type. It may introduce a bit of redundancy but it avoids a lot of problems.

For parametric invariance, the argument is that neither covariance nor contravariance are generally correct. If you are using a Vector in a read only way, then covariance is good, but if you’re using it in a write-only way then contravariance is good. If both then only invariance is correct. We have syntax for all of these: Vector{<:Real}, Vector{>:Real} and Vector{Real}. So invariance is really only a default — the other variances can also be expressed, they’re just not the default because invariance is the only safe choice everywhere.

It’s a really good observation that since Vector{Real} is concrete it cannot have a subtype. But of course, if we had chosen for that syntax to indicate covariance, then the type would not have been concrete and it would have been possible to subtype, so I think that argument may be circular.

23 Likes

Thank you for your reply.

So it seems the critical point for performance is whether the element type is final or not, rather than whether it is abstract or concrete.

Vector{Final} is performant
Vector{NotFinal} is not efficient (boxing, pointers etc)

In your example of subtyping Float64:
Vector{Float64} would suffer the same performance issues as Vector{Real}
However, Vector{FinalSubtypeOfFloat64} would be efficient.

Yes, this is what I was trying to get at.

It seemed unsatisfying to me to say [1,2,3] is not a subtype of Vector{Number} for performance reasons alone, without further explanation.

Another chain of logic:

  1. Parametric types P{T} require T to be final for performance.
  2. All concrete types are final for simplicity.
  3. P{AbstractType} is a concrete type.
  4. Therefore parametric types are invariant

Hypothetically and probably flawed:

  1. Parametric types P{T} require T to be final for performance.
  2. All non-parametric concrete types are final for simplicity.
  3. Parametric types can be covariant

Yes, but if Vector{Real} was not concrete, you wouldn’t be able to have an instance of it.

I don’t quite get how Vector{>:Real} represents contravariance. Covariance works:

julia> Float64 <: Real
true

julia> Vector{Float64} <: Vector{<:Real}
true

But contravariance doesn’t seem to work:

julia> Float64 <: Real
true

julia> Vector{Float64} >: Vector{>:Real}
false

I think this is what you want:

julia> Vector{>:Float64} >: Vector{Real}
true

>: Real means supertypes of Real. Rather, you want supertypes of Float64 here.

4 Likes

Ok, that seems like a rather loose analogy for contravariance. By the traditional definition of contravariance, if Foo was contravariant, then

Float64 <: Real

would imply

Foo{Real} <: Foo{Float64}

But what we actually have is

Foo{Real} <: (Foo{T} where T>:Float64)

and I would say that

Foo{T} where T>:Float64

is rather different from Foo{Float64}, especially since Foo{Any} is a valid subtype of Foo{T} where T>:Float64

What I’m trying to get at here and not doing a great job of explaining is that the choice of invariance is actually more superficial than people think it is. That’s because the language actually has ways of expressing all the variances. The only way that invariance is privileged is that the syntax T{P} is invariant with respect to P rather than being covariant (or even contravariant). So let me try to clarify a bit. I’m going to write T{=:P} for the type we normally write as T{P}, i.e. the invariant parameterization of T. There three distinct types associated with any parametric type:

  1. The covariant type, T{<:P}‚ which includes T{=:S} for all S <: P
  2. The invariant type, T{=:P}, where P is fixed.
  3. The contravariant type, T{>:P}, which include T{=:S} for all S >: P.

The first and last are abstract whereas the middle is concrete. Languages may use different syntaxes for these concepts, may not have syntaxes for all of them, and may even use the same syntax for more than one of these concepts. They may even combined some of these concepts in their type system.

For example, most languages that are said to have “covariant parametric types” use the syntax T{P} to mean both T{<:P} and T{=:P} in different contexts. They also combine these distinct concepts in their type systems: the type of a vector that can any element that is a subtype of Real is Vector{Real} in such a system; but in dispatch that type also applies to any vector with an element type that is a subtype of Real. In such a language the syntax and the type Vector{Real} represent both the concrete type Vector{=:Real} when asking the type of things and the abstract type Vector{<:Real} when doing dispatch. In effect, what it means to be a language with covariant parametric types is that a language conflates the distinct concepts T{<:P} and T{:=P} both syntactically and conceptually. Perhaps I’m missing something here and someone who is an expert in these kinds of languages will be able to correct me on this.

Julia distinguishes T{<:P} from T{=:P} both syntactically and conceptually, using the syntax T{P} for the latter. The type Vector{Real} is concrete and is the type of a vector whose element type is Real. You can use this type in dispatch and write a method that only dispatches on vectors with that exact element type. The type Vector{<:Real}, on the other hand, is abstract and not the type of any value. You can, however, use this type in dispatch (or as the type annotation of a field or as parameter of another type).

The decision to not allow subtyping concrete types is another case of languages traditionally failing to distinguish different concepts. When you subtype a concrete type in a language that allows it, this is really conflating two distinct concepts:

  1. The type that is concrete, complete and can be instantiated but not subtyped
  2. The type that is abstract, incomplete and can be subtyped but not instantiated

Many OOP languages treat these distinct concepts as a single thing. In such a language, you can declare a type as final thereby precluding the second option. Otherwise you have, say, Vector{Float64} and you can’t tell if the element type is the concrete, final Float64 type or the abstract, subtypable Float64 type.

If you want to both instantiate and subtype a type in Julia, you do have an option: you make two separate type — one abstract, A and a concrete subtype of that, C <: A. Then we can distinguish, for example, Vector{C} which can only hold instances of the concrete type C, from Vector{A} which can hold instances of C but also instances of any other subtype of A. But most of the time this isn’t necessary, especially since new methods for C can be defined externally.

In general the Julian approach is primarily about keeping distinct concepts distinct both syntactically and in the type system. One aspect of this is not conflating T{<:P} and T{P}. The other aspect is forcing the user to distinguish between a concrete type that can be instantiated and abstract types that can be specialized.

15 Likes

One issue with the separate-abstract-type pattern is that you can’t introduce the abstract type from downstream code (ie you have to PR it to the original package or Base), and if you do introduce one later, you need to also update a bunch of method signatures to use it (and probably define an API around it to use to make those functions generic).

It seems like subtyping a concrete type is a bit similar to what one wants for wrapper types (ie to act like the thing being wrapped generically with a few specialised methods to tweak the behaviour). But most wrapper types probably want to add a few fields as well, and then they don’t have the same data layout, so that wouldn’t be good to allow as a subtype of a concrete type.

5 Likes

Thanks for taking the time to write this up.
You have explained it very well. I think I get it.

Ha. ha, but the more I understand, the more I realise how much I don’t understand.

1 Like

Yes, that’s very true. My fairly weak response to that is that I suspect that subtyping doesn’t work super well in such situations when it isn’t explicitly planned for, but that is kind of an anemic argument.

An example to support it is that there is an outstanding request to introduce AbstractComplex as an abstract supertype of the Complex type. Which is on the face of it super easy. The hard part is that one then has to go through every usage of Complex in base and decide if it should remain as Complex or become AbstractComplex. The obvious OOP counterpoint is “well, if they were the same type then you wouldn’t have that problem!” But of course, if Complex and all other concrete types where subtypeable you have the problem that the compiler also doesn’t know which was intended and we couldn’t have efficient inline storage of Vector{Complex{Float64}} because we wouldn’t know if that was mean to be the triply concrete type — concrete Vector, concrete Complex and concrete Float64 — or if some of those types were maybe the abstract counterpart. The Vector would need to store pointers to individually allocated Complex objects, each of which might actually be a subtype of Complex. The real and imaginary parts of the Complex value also couldn’t be stored inline since each of them might be just a Float64 or some exotic subtype of Float64, which requires a type tag to tell you what type and might require more storage if it has more data than just the usual 8 bytes.

So while it is a burden to be required to introduce the distinction between Complex and AbstractComplex in this kind of situation, I think it’s not only necessary for performance, but may also be somewhat unavoidable in that there are situations where you need to explicitly distinguish which one you are talking about for semantic reasons.

6 Likes

Very true; I was thinking maybe the data-layout part of what it means to be a type could be separated from the dispatch meaning (using the categories from https://mikeinnes.github.io/2020/05/19/types.html), so that one could subtype concrete types without changing data-layout or field names, just to give a way to add more specific methods to dispatch to. I thought that would at least not introduce performance problems, but at the cost of being useful only in narrow circumstances (since you can only use it for the simplest wrapper types that don’t add any new fields).

That’s quite a systematic review, now I also better understand the decisions involved!

As for “introducing new abstract types”, I often think when such situations arise that traits would suit the problem much better. If they were as “first-class” and convenient in the language as [sub]types are, and if all packages + Base used them. Actually, all the abstract type stuff would likely become unnecessary in this case, as traits are more general.

4 Likes

Wow, I explained the logical issue with subtyping concrete types a lot better back in 2014: Why is it impossible to subtype a struct? - #3 by StefanKarpinski. Of course, the issue was much fresher in my mind back then. Here’s what I wrote:

While there are a number of practical reasons to disallow subtyping of concrete types, I think they all stem from the basic fact that subtyping a concrete type is logically unsound. A type can only be instantiated if it is completely specified, and it only makes sense to subtype something if it is incompletely specified. Thus, a type should either be abstract or concrete but not both. What object-oriented languages that allow the same type to be both instantiated and subtyped are really doing is using the same name for two different things: a concrete type that can be instantiated and an abstract type that can be subtyped. Problems ensue since there’s no way to express which you mean. You end up with attempts to resolve this confusion like the “final” keyword and recommendations against ever subtyping classes that weren’t intended to be abstract. But wouldn’t it be better not to conflate the two things in the first place?

The point where OOP languages disagree with this analysis is that they consider it fine to subtype something that is completely specified in order to modify its behavior, rather than merely to finish specifying an incompletely specified abstraction. This highlights two very distinct kinds of subtyping:

  1. Subtyping to complete a partially implemented abstraction;
  2. Subtyping to modify an already-complete, concrete implementation.

We basically take the position that the first kind of subtyping is ok but the second kind is not and is therefore disallowed, whereas standard OOP languages allow both. One of the main ways that people want to “modify” an already-complete, concrete implementation in class-based OOP languages is by adding new methods to them, which is something that doesn’t require subtyping at all in Julia because methods are added externally. So one of the major reasons to need the second kind of subtyping is absent in Julia, making it far easier to completely forgo it.

17 Likes

I’m reluctant to necropost, but I just noticed that I never replied to this point and I wanted to address it. The P{>:T} construct does have the expected contravariant behavior:

julia> Int <: Real
true

julia> Vector{>:Real} <: Vector{>:Int}
true

This makes sense since A <: B implies that (T where T>:B) <: (T where T>:A), i.e. if A is contained in B then the collection of things that contain B is a subset of things that contain A (since B <: C implies A <: C).

4 Likes

Is anyone able to explain this behavior that is related to the op:

julia> a = [1.3,2.1]
2-element Vector{Float64}:
 1.3
 2.1

julia> a isa Vector{<:Number}
true

julia> a = [[1.3,2.1], [1.5, 2.5], [1.5,4.1,6.5]]
3-element Vector{Vector{Float64}}:
 [1.3, 2.1]
 [1.5, 2.5]
 [1.5, 4.1, 6.5]

julia> a isa Vector{Vector{<:Number}}
false

Same principle:

julia> a = [[1.3,2.1], [1.5, 2.5], [1.5,4.1,6.5]]
3-element Vector{Vector{Float64}}:
 [1.3, 2.1]
 [1.5, 2.5]
 [1.5, 4.1, 6.5]

julia> a isa Vector{<:Vector{<:Number}}
true

https://docs.julialang.org/en/v1/manual/types/#man-parametric-composite-types

5 Likes

We have:

julia> a isa Vector{Vector{T} where T<:Number}
false

julia> a isa Vector{Vector{T}} where T<:Number
true

The first one is the same as Vector{Vector{<:Number}}, where Vector{<:Number} is an abstract type.

3 Likes

Thanks @Nathan_Boyer and @sgaure… this one is hard to wrap my head around

1 Like

Another example:

b = [Number[1.3,2.1], Number[1.5, 2.5]]
# all these are true:
isa.(Ref(b), [Vector{Vector{Number}}, Vector{<:Vector{<:Number}}, 
 Vector{<:Vector{Number}}])
1 Like