I’m wondering why you need to state explicitly that your function should accept all subtypes of a type defined in the hierarchy.
julia> x = collect(1:2)
2-element Vector{Int64}:
1
2
julia> g(x::Array{T}) where {T<:Real} = x .* 2
g (generic function with 2 methods)
julia> g(x)
2-element Vector{Int64}:
2
4
Is there a case where you would ever define a type higher up in the hierarchy - e.g.,
g(x::Array{Real}) = x .* 2
where you wouldn’t want the function to apply to subtypes? I.e., why is the definition above not permitted?
The key point is that Array{Real} and Array{Float64} have completely different memory layouts. Array{Real} is an array of pointers to elements whose types are not known until run time. By comparison, Array{Float64} is simply a bunch of Float64s next to each other in memory (and you know the element type at compile time). In Julia, no concrete types have subtypes. Array{Float64} is not a subtype of Array{Real} (i.e. type parameters in Julia are invariant).
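A quick way to check these claims yourself is sketched below. It only uses built-in queries (`<:`, `isbitstype`, `typeof`); the variable names are made up for illustration, and `isbitstype` is used here only as a rough indicator of whether values can be stored inline.

```julia
# Float64 is a subtype of Real, but that relation does not carry over to Array.
elem_is_subtype  = Float64 <: Real                    # true
array_is_subtype = Vector{Float64} <: Vector{Real}    # false: parameters are invariant

# Float64 is a plain-bits type, so its values can sit directly in the array buffer.
# Real is abstract, so a Vector{Real} has to reference values of varying type.
float_is_bits = isbitstype(Float64)    # true
real_is_bits  = isbitstype(Real)       # false

# Each element of a Vector{Real} carries its own runtime type.
mixed = Real[1, 2.5]
elem_types = map(typeof, mixed)        # [Int, Float64]
```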
So it’s really because the in-memory implementation and the performance are different that Array{Real} cannot implicitly accept Array{Float64}? I thought there was a layer of abstraction here, but I guess I was mistaken. So what happens when g is defined as g(x::Array{<:Real})? Does it just accept Array{Float64} arguments but implement it with the inefficient memory layout?
Interesting that you can make the data type Real. While it has child nodes, it’s a concrete data type, but I guess this is useful if you were to return a vector with different data types?
Thanks for the shortcut… I suppose that makes more sense in this case.
Not sure I understand this point. In the first call you pass an Int (which is a subtype of Real), but in the second one you broadcast, so you can input an iterable, but the function itself will still only accept objects of type Real (meaning, any of its subtypes, since there cannot be an instance of Real; it is an abstract type).
julia> a(x::Real) = x * 2
a (generic function with 1 method)
julia> a(2)
4
julia> a(1:2)
ERROR: MethodError: no method matching a(::UnitRange{Int64})
# Broadcasting over anything other than `Real`s fails as well
julia> a.([0.0im, 0.0im])
ERROR: MethodError: no method matching a(::ComplexF64)
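For contrast, here is a small sketch of the broadcast case that does work. It reuses the `a` from the snippet above; `doubled` and `by_hand` are just illustrative names.

```julia
a(x::Real) = x * 2

# The dot applies `a` to each element, so every call still receives a single Real.
doubled = a.(1:2)                  # [2, 4]

# Written out by hand, the broadcast does roughly the same as this comprehension.
by_hand = [a(v) for v in 1:2]
```

The range itself is never passed to `a`; only its Int elements are, which is why the method signature `a(x::Real)` is enough.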
It doesn’t have to be primitive. Immutable structs with immutable field members work as well. So you can create your own composite types that can be stored inline in arrays.
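As a rough illustration, the two struct types below are hypothetical and not from this thread. `isbitstype` is one convenient check for the plain-data case; it is a sketch of the idea, not a full description of when Julia stores elements inline.

```julia
# Hypothetical example types, not from the thread.
struct Point2D                  # immutable, and both fields are plain-bits types
    x::Float64
    y::Float64
end

mutable struct MutablePoint2D   # mutable, so instances live behind a reference
    x::Float64
    y::Float64
end

inline_ok  = isbitstype(Point2D)          # true
inline_mut = isbitstype(MutablePoint2D)   # false
point_size = sizeof(Point2D)              # 16 bytes: two Float64 fields
```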
What helped me understand the distinction between Vector{Real} and Vector{<:Real} is the insight that abstract types always denote a collection of types. So Real is the collection of types like Int64, Float64 and so on. Then you can see that Vector{<:Real} is also a collection of types. This is even clearer if you write it more explicitly as Vector{T} where T<:Real. On the other hand, Vector{Real} denotes just a single type (the vectors that store anything from the set of Real) and is thus a concrete type.
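The "collection of types" view can be checked directly with `isconcretetype` and `<:`. This is only a sketch; the variable names are invented for illustration.

```julia
single_type = isconcretetype(Vector{Real})       # true: one specific type
type_family = isconcretetype(Vector{<:Real})     # false: a set of types

in_family_int   = Vector{Int}     <: Vector{<:Real}   # true
in_family_float = Vector{Float64} <: Vector{<:Real}   # true
in_family_real  = Vector{Real}    <: Vector{<:Real}   # true, since Real <: Real

# A different element type never gives a subtype of Vector{Real} itself.
int_is_real_vec = Vector{Int} <: Vector{Real}         # false
```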
One way to think of it is that Vector{Real} is a collection that can hold Ints, Floats, Rational, etc. etc. You can put all sorts of numbers into it. You cannot do that with a Vector{Int}. The Vector{Real} promises to accept for example the number 2.5. What happens if you try to put 2.5 into a Vector{Int}? That promise is broken.
All technicalities aside, from a purely intuitive standpoint, Vector{Int} does not have the properties of a Vector{Real}.
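The "broken promise" can be seen with plain `push!`. A minimal sketch, with illustrative variable names:

```julia
reals = Real[1, 2//3]
push!(reals, 2.5)            # fine: a Vector{Real} can hold any Real

ints = Int[1, 2]
err = try
    push!(ints, 2.5)         # 2.5 cannot be converted to Int exactly
    nothing
catch e
    e
end

failed_as_expected = err isa InexactError   # true
```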
The ::Array{<:Real} type signature/annotation basically translates as “any type Array{T} for some T that subtypes Real”. In the REPL:
julia> Array{<:Real} == (Array{T} where {T<:Real})
true
Note that only abstract types can have subtypes in Julia.
To understand the above example better, also see the documentation on “UnionAll” types.
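The following sketch puts these statements side by side. `subtypes` comes from the `InteractiveUtils` standard library, which the REPL loads automatically but a script does not; the variable names are illustrative.

```julia
using InteractiveUtils   # provides `subtypes` outside the REPL

shorthand = Vector{<:Real}
longhand  = (Vector{T} where {T<:Real})

same_type   = shorthand == longhand      # true
is_unionall = shorthand isa UnionAll     # true

# Abstract types have subtypes; concrete types have none.
real_children  = subtypes(Real)          # includes Integer, AbstractFloat, ...
int64_children = subtypes(Int64)         # empty
```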
implements it with the inefficient memory layout?
Regarding this part of the question specifically, note that Julia specializes code for the given argument types when compiling the function, so there’s no performance penalty, at least after the compilation is done. See the Wikipedia article on monomorphization.
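One way to convince yourself that nothing is converted to a pointer-based layout is to look at what comes back. In the sketch below, `double_all` is a hypothetical stand-in with the same shape as the `g` discussed above.

```julia
# Hypothetical name; same shape as the `g` discussed in the thread.
double_all(x::Array{<:Real}) = x .* 2

from_floats = double_all([1.0, 2.0])   # Vector{Float64}
from_ints   = double_all([1, 2])       # Vector{Int}

# One method definition covers both calls; Julia compiles a separate
# specialization for each concrete argument type it encounters.
n_methods = length(methods(double_all))   # 1
```

The element type of the input is preserved in the output, so a Vector{Float64} stays a densely packed Vector{Float64}.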
Yes, totally this! I’ve already seen this question come up a few times here on Discourse, and answers typically start from internal/technical aspects like memory layout. Meanwhile, there’s a clear intuitive explanation for why Vector{Int} shouldn’t be accepted if a function declares Vector{Real}: you can only put 2.5 into the latter, not the former. Maybe there’s some canonical place to put this short explanation, so that it’s easy to link to afterwards?
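To make the intuition concrete, here is a sketch with a hypothetical function `add_half!` (not from this thread) whose signature declares Vector{Real} and whose body relies on being able to store a 2.5.

```julia
# Hypothetical function: its signature says "I may store any Real in v".
add_half!(v::Vector{Real}) = push!(v, 2.5)

reals = Real[1, 2]
add_half!(reals)            # works: reals now also holds 2.5

ints = [1, 2]               # Vector{Int}
rejected = try
    add_half!(ints)         # no matching method for Vector{Int}
    false
catch err
    err isa MethodError
end
```

If Vector{Int} were accepted here, the body would have to put 2.5 into an Int array, so dispatch rejecting the call up front is the consistent choice.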
I understand what type-invariance is now but I don’t get your example - if I define a function with argument type Vector{Real}, I would expect it to take a collection that contains a value 2.5, which is not an Int but Float, which is still a subtype of Real. I would not expect that for a function with argument type Vector{Int}.