I don’t understand this:
julia> isa([1,2,3], Vector{Number})
false
Each 1, 2, or 3 is certainly a Number:
julia> isa(1, Number)
true
It is illogical to say [1, 2, 3] is not Vector{Number} when each element is a Number.
I don’t understand this:
julia> isa([1,2,3], Vector{Number})
false
Each 1, 2, or 3 is certainly a Number:
julia> isa(1, Number)
true
It is illogical to say [1, 2, 3] is not Vector{Number} when each element is a Number.
See the relevant manual section: Types · The Julia Language
Just to extract some relevant points because the manual is quite dense:
Vector{Number}
is a concrete type, and [1, 2, 3]
has type Vector{Int}
, which is also a concrete type. One concrete type is never a subtype of another concrete type, they are the leaves of the type tree.
Vector{Number}
is concrete, even though Number
is not a concrete type. That’s because it has a concrete implementation which can store all types that are subtypes of Number
, it has a specific memory layout etc. On the other hand, AbstractVector{Int}
is the other way around and not a concrete type, because the container is abstract even though the element is concrete.
What you can do instead is [1, 2, 3] isa Vector{<:Number}
which is true
. That’s because <:Number
is a sort of placeholder which means “any type which is a subtype of Number”. This is often needed for dispatching on containers where you want to allow set of element types. f(x::Vector{Number})
can take only arguments of type Vector{Number}
, whereas g(x::Vector{<:Number})
can take e.g. Vector{Int}
, Vector{Float64}
, Vector{Real}
, Vector{Number}
, etc.
Wow. Thanks Jules. This is a good summary. I wasn’t able to make any sense of the reference manual, frankly speaking (besides yep it is what it is).
isa([1,2,3], Vector{T} where T<:Number)
is also true
.
There is also a nice video about subtyping: JuliaCon 2018 | Subtyping made friendly | Francesco Zappa Nardelli - YouTube
There is a very simple explanation why you actually wouldn’t want isa([1,2,3], Vector{Number})
to be true
. [1,2,3]
just cannot generally be used where a Vector{Number}
is expected. Suppose you have a function:
f(vec::Vector{Number}) = push!(vec, 1.5)
It works with Vector{Number}
:
vn = Vector{Number}([1, 2, 3])
f(vn)
But it would fail if one could pass Vector{Int}
to it:
g(vec) = push!(vec, 1.5)
vi = [1, 2, 3]
g(vi) # error - cannot push 1.5 to vector of ints
@sairus7, thanks, really nice video except for the sound quality (had to use a sound booster extension to Chrome). It seems that Francesco Nardelli’s talk covers the material in this paper.
All these explanations fall into “it is what it is”. Mr. Nardelli explained in the video “It is a language design choice”.
Which is OK. I suspect it’s for performance in dispatching. Will a Vulcan agree it is the logical thing to define? Probably not.
A = [1, 2, 3]
Is A a vector? Yes.
julia> isa(A, Vector)
true
Are all elements of A Numbers? Yes.
julia> all(x -> isa(x, Number), A)
true
Yet, A is not a Vector{Number}
julia> isa(A, Vector{Number})
false
As I said, I really don’t have a problem remembering the fact A is not a Vector{Number} and remember to use Vector{<:Number} to get what I want. So no offense. Just a discussion of programming design and general logic thinking. Sometimes if we identify a language design choice is not natural, then perhaps many unintentional bugs may arise from it.
Digression: Reminds me of a Chinese philosophy: White Horse is Not a Horse
Thanks.
apalvin:
Thanks.
Slightly different from what I had in mind. Your example is destructive. An auto promotion of some kind. Not what I’m expecting. Good example.
The original question is a simple non-destructive subsumption relationship definition question.
It’s perfectly feasible to design the language to not allow auto promotion but allows the other case.
Thx
Forgot to…
Wish you all happy new year.
Generally it is not very useful to consider programming language design decisions from the perspective of being “logical” — you can’t really derive them from basic principles of logic, so these criteria usually just reflect user expectations. But those are very heterogeneous, and depend on your prior experience and what languages you were exposed to, among other things.
The important thing is that once you are aware of how this works in Julia, it is easy to remember and applies consistently through the type system.
There is no type-promotion in my example, as far as I understand. It just shows a function that works for Vector{Number}
but fails for Vector{Int}
. This means one cannot substitute the latter for the former.
It’s not just arbitrary choice though.
An object can be covariant only on types it “produces” and contravariant only on types it “consumes”.
E.g., a function is covariant on its return type (i.e., we can use a function returning an Int
anywhere we expect a function returning a Number
) and contravariant on argument type (i.e., we can use a function accepting a Number
anywhere we expect a function accepting an Int
).
The problem with Vector
specifically (and other mutable containers), as pointed out by @aplavin, is that it can be used both as producer and consumer. So, the answer to [1, 2, 3] isa Vector{Number}
should depend on whether you want to read from or write to the vector (not necessarily push!
, just an assignment of a vector element raises the same issue) if we want co- or contravariant parameters. In Julia, it happens that type parameter invariance improves performance, but a programmer can emulate co-/contravariance if needed by explicitly using <:
or >:
.
Julia’s subtyping is based on Set logic. In set logic it’s definitely possible that
isa([white horse], Vector{<:Horse}) == true
but
isa([white horse], Vector{Horse}) == false
.
The concept of subtyping is often defined via the Liskov substitution principle. Quoting from Wikipedia:
Let \phi (x) be a property provable about objects x of type T. Then \phi (y) should be true for objects y of type S where S is a subtype of T.
Speaking in looser terms, we can phrase the substitution principle as follows:
A few more definitions that are relevant (from a language agnostic point of view):
Vector
is a type constructor that takes a type T
and creates a new type Vector{T}
.I
is a type constructor, B <: A
, and I{B} <: I{A}
, then I
is covariant.I
is a type constructor, B <: A
, and I{A} <: I{B}
, then I
is contravariant.I
is not covariant or contravariant (or bivariant), then I
is invariant. In other words, B <: A
does not imply any subtype relationship for I{B}
and I{A}
.So, if Cat <: Animal
, is Vector{Cat} <: Vector{Animal}
? Consider the following two properties of a Vector{Animal}
:
getindex
to read an element from the vector returns an Animal
.setindex!
can be used to set the value on an element of the vector to any Animal
, for instance a Dog
.A Vector{Cat}
satisfies the first property—reading an element of a Vector{Cat}
returns a Cat
, which is an Animal
. However, a Vector{Cat}
does not satisfy the second property—you can’t set an element of a Vector{Cat}
to a value with type Dog
. Thus, by the substitution principle, Vector{Cat}
cannot be a subtype of Vector{Animal}
. So the Vector
type constructor needs to be invariant.
Notice that the issue arises because of property #2, in other words, because vectors are mutable. If vectors were immutable, then it would be reasonable to have Vector{Cat} <: Vector{Animal}
, i.e., for Vector
to be covariant. In fact, tuples are immutable and are covariant:
julia> Int <: Number
true
julia> Tuple{Int} <: Tuple{Number}
true
However, it’s not quite accurate to use the above argument to explain why Vector{T}
is invariant in T
. The real reason is because all parametric types in Julia are invariant (except for tuples). Vector{T}
just happens to be one case of a parametric type. To borrow an example from the manual, we could define an immutable Point
type, like so:
struct Point{T}
x::T
y::T
end
The Point
type is immutable, so one could legitimately ask why Point{Float64}
is not a subtype of Point{Real}
. In other words, why isn’t Point{T}
covariant in T
? The manual does provide some explanation:
In other words, in the parlance of type theory, Julia’s type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of
Point{Float64}
may conceptually be like an instance ofPoint{Real}
as well, the two types have different representations in memory:
- An instance of
Point{Float64}
can be represented compactly and efficiently as an immediate pair of 64-bit values;- An instance of
Point{Real}
must be able to hold any pair of instances ofReal
. Since objects that are instances ofReal
can be of arbitrary size and structure, in practice an instance ofPoint{Real}
must be represented as a pair of pointers to individually allocatedReal
objects.
So, if I interpret that correctly, that means that the main reason that parametric types are invariant in Julia is due to performance considerations. However, there could be more to the story—I’d be interested to hear a more detailed explanation of the design decision.
Languages like C#, Scala, and OCaml have a syntax to declare whether custom types are covariant, contravariant, or invariant. For example, Scala prefixes type variables with +
or -
to indicate covariant or contravariant, respectively. If Julia had syntax like that, then we could do something like this:
struct ImmutableVector{+T,N}
x
end
and then we would have ImmutableVector{Float64} <: ImmutableVector{Real}
.
Yes, it’s critical for performance. See my explanation here: Problem with Complex{Rationals} - #7 by stevengj
Whilst I understand the performance differences, I don’t fully get this:
If you have a compiled function that operates on an Complex{Real} or an array of Complex{Real}, therefore, it is completely different from code that operates on Complex{Rational{Int}}. You can’t use either one for the other. (If A is a subtype of B, it essentially means “compiled code for B can be used on instances of A” — this is the purpose of the subtype relation, to decide what code is applicable to what types.)
Doesn’t Julia specialize implementations of function methods for each concrete argument type?
But Julia doesn’t allow subtyping from concrete types. Presumably there are good reasons for this. (naively, it seems it would create complications).
Subtype relation is used for dispatch, not compiled code?
Conceptually, if Julia allowed subtyping from concrete types, would there be a performance issue? Would the specialization handle this correctly?
So maybe the reason for invariant parametric types is a bit more complicated?
I am not sure what that would look like.
Imagine what this would mean for working with an array a
of type Vector{Float64}
, for example:
Currently, this can be stored as a consecutive sequence of 8-byte chunks in memory, and if the compiler sees a[i] + 1.5
it knows that it can compile this into a load instruction and a floating-point addition.
If Float64
could be sub-typed, however, then the elements of a
could be any subtypes of Float64
— they would have to be pointers to “boxes” containing a type tag (to say what type they really are) along with the actual data, and for an expression like a[i]+1.5
the compiler would need to generate code that chases the pointer, looks at the type tag, and does dynamic (runtime) dispatch to the correct +
operation.
Moreover, dynamic dispatch in a multiple-dispatch language like Julia is even more expensive than dynamic dispatch in a single-dispatch (traditional OOP) language like C++ (where they have vtables), so devirtualization (figuring what what methods to call at compile-time) is a critical optimization, to the point where nearly all optimized Julia code is devirtualized in practice. Making concrete types final (non-subtypable) gives the compiler many more chances for devirtualization. Even in C++, people often recommend final
classes to enable devirtualization.
Because multiple dispatch allows you to add new functions to existing types (see also this discussion and this talk on the expression problem of OOP), there is much less need for subtyping of concrete types in Julia than there is in OOP languages. Because of that, it was a practical choice to make all concrete types “final”, reaping vast benefits in performance as well as simplifying the language.
I think the boolean question you are trying to ask is actually this:
julia> typeof([1,2,3]) <: Vector{T} where T <: Number
true
or maybe simpler:
julia> isa([1,2,3], Vector{<:Number})
true
Also see this:
julia> isa(Number[1,2,3], Vector{Number})
true