Need for explicitly defining subtypes in function arguments?

Sorry I’m very new to type systems here.

I’m wondering why you need to state that your function should accept all subtypes in the class hierarchy for nodes which are defined.

julia> x = collect(1:2)
2-element Vector{Int64}:
 1
 2

julia> g(x::Array{T}) where {T<:Real} = x .* 2
g (generic function with 2 methods)

julia> g(x)
2-element Vector{Int64}:
 2
 4

Is there a case where you would ever define a type higher up in the hierarchy - e.g.,

g(x::Array{Real}) = x .* 2

where you wouldn’t want the function to apply to subtypes? I.e., why is the definition above not permitted?

2 Likes

The key point is that Array{Real} and Array{Float64} have completely different memory layouts. Array{Real} is an array of pointers to elements whose concrete type is not known ahead of time. By comparison, Array{Float64} is simply a bunch of Float64s next to each other in memory (and you know the element type at compile time). In Julia, no concrete type has subtypes, and Array{Float64} is not a subtype of Array{Real} (i.e. Julia’s type parameters are invariant).
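
A quick way to check this yourself; a minimal sketch that only uses functions from Base:

```julia
# Element types are related by subtyping...
@show Float64 <: Real                   # true
# ...but the parametric array types are not (type parameters are invariant).
@show Vector{Float64} <: Vector{Real}   # false
# Float64 is a plain-bits type, so its values can be stored inline in an array.
@show isbitstype(Float64)               # true
# Real is abstract, so a Vector{Real} has to store references to its elements.
@show isbitstype(Real)                  # false
```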

7 Likes

It is permitted:

julia> g(x::Array{Real}) = x .* 2
g (generic function with 1 method)

julia> g(Real[i for i in 1:3])
3-element Vector{Int64}:
 2
 4
 6

You just need to call the method with the appropriate type:

julia> typeof(Real[i for i in 1:3])
Vector{Real} (alias for Array{Real, 1})
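
To make the difference concrete, here is a small sketch (the names `h` and `ints` are made up for illustration). A Vector{Int64} does not match the Array{Real} signature, but the same data converted to a Vector{Real} does:

```julia
h(x::Array{Real}) = x .* 2

ints = [1, 2, 3]                 # a Vector{Int64}

# Calling with the Vector{Int64} fails: no method matches.
result = try
    h(ints)
catch err
    err
end
@show result isa MethodError     # true

# Converting the container to Vector{Real} makes the call match.
@show h(convert(Vector{Real}, ints)) == [2, 4, 6]   # true
```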

If it makes any difference to you, you can shorten this syntax a bit:

g(x::Array{<:Real}) = x .* 2
6 Likes

Doesn’t Float64 <: Real => true mean that a concrete type has a subtype? Though Real is not an abstract type.

I now see your argument made in Parametric Composite Types section.

https://docs.julialang.org/en/v1/manual/types/#Parametric-Types

So it’s really because the implementation in memory and the performance are different that Array{Real} cannot implicitly accept Array{Float64}? I thought there was a layer of abstraction in this but I guess I was mistaken. So what happens when g is defined as g(x::Array{<:Real})? Does it just accept Array{Float64} arguments but implement it with the inefficient memory layout?

Interesting that you can make the data type Real- while it has child nodes it’s a concrete data type, but I guess if you were to return a vector with different data types, this is useful?

Thanks for the shortcut… I suppose that makes more sense in this case.

I guess it really is about composite types and their effect on memory management, and not strictly about subtypes, since this works:

julia> a(x::Real) = x * 2
a (generic function with 1 method)

julia> a(2)
4

julia> a.(1:2)
2-element Vector{Int64}:
 2
 4

where what is passed is either an Int64 or Vector{Int64}

No:

julia> isabstracttype(Real)
true

Not sure I understand this point. In the first call you pass an Int (which is a subtype of Real), but in the second one you broadcast, so you can input an iterable, but the function itself will still only accept objects of type Real (meaning, any of its subtypes, since there cannot be an instance of Real; it is an abstract type).

julia> a(x::Real) = x * 2
a (generic function with 1 method)

julia> a(2)
4

julia> a(1:2)
ERROR: MethodError: no method matching a(::UnitRange{Int64})

# Broadcasting over anything other than `Real`s fails as well
julia> a.([0.0im, 0.0im])
ERROR: MethodError: no method matching a(::ComplexF64)
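
Put differently, the dot call just applies `a` element by element. A minimal sketch of that, reusing the `a` from above:

```julia
a(x::Real) = x * 2

# The dot call applies `a` to each element, so every individual call
# receives an Int64, which is a subtype of Real.
@show a.(1:2) == map(a, 1:2)   # true

# The range itself is not a Real, so no method applies to it directly.
@show applicable(a, 1:2)       # false
@show applicable(a, 2)         # true
```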
2 Likes

My understanding is that the memory layout for Array{Float64} is that way because Float64 is a primitive type whose size is known.

Is there any similar benefit in using Array{String} as opposed to Array{AbstractString}, despite the fact that String is not a primitive type?

2 Likes

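Yes. The benefit is that by knowing you have Strings, you (and the compiler) know the concrete type of whatever you get out of the array, even though the Strings themselves are still stored as references.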

2 Likes

It doesn’t have to be primitive. Immutable structs whose fields are themselves immutable work as well. So you can create your own composite types that can be stored inline in arrays.
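
For instance, a minimal sketch with a made-up `Point` struct (the name and fields are only for illustration):

```julia
# Hypothetical immutable struct with only plain-bits fields.
struct Point
    x::Float64
    y::Float64
end

@show isbitstype(Point)    # true
@show isbitstype(String)   # false

pts = [Point(0.0, 0.0), Point(1.0, 2.0), Point(3.0, 4.0)]
# The array's data buffer holds the structs themselves, back to back.
@show sizeof(pts) == length(pts) * sizeof(Point)   # true
```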

7 Likes

Didn’t realize Real was an abstract type.

Anyway, my point was that it is counterintuitive that Int64 <: Real is true while Array{Int64} <: Array{Real} is false.

1 Like

What helped me understand the distinction between Vector{Real} and Vector{<:Real} is the insight that abstract types always denote a collection of types. So Real is the collection of types like Int64, Float64 and so on. Then you can see that Vector{<:Real} is also a collection of types. This is even clearer if you write it more explicitly as Vector{T} where T<:Real. On the other hand, Vector{Real} denotes just a single type (the vectors that can store anything from the set of Real) and is thus a concrete type.
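
You can ask Julia about this directly; a minimal sketch using only Base functions:

```julia
# Vector{Real} is a single, concrete type.
@show isconcretetype(Vector{Real})      # true
# Vector{<:Real} is a whole family of types (a UnionAll), not a concrete type.
@show isconcretetype(Vector{<:Real})    # false
@show Vector{<:Real} isa UnionAll       # true
# Both Vector{Int64} and Vector{Real} belong to that family...
@show Vector{Int64} <: Vector{<:Real}   # true
@show Vector{Real} <: Vector{<:Real}    # true
# ...but Vector{Int64} is not a subtype of the single type Vector{Real}.
@show Vector{Int64} <: Vector{Real}     # false
```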

3 Likes

See:

4 Likes

One way to think of it is that Vector{Real} is a collection that can hold Ints, Floats, Rationals, etc. You can put all sorts of numbers into it. You cannot do that with a Vector{Int}. The Vector{Real} promises to accept, for example, the number 2.5. What happens if you try to put 2.5 into a Vector{Int}? That promise is broken.

All technicalities aside, from a purely intuitive standpoint, Vector{Int} does not have the properties of a Vector{Real}.
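
A small sketch of that broken promise (the variable names are just for illustration):

```julia
reals = Real[1, 2]
push!(reals, 2.5)                # fine: 2.5 is a Real
@show length(reals)              # 3

ints = Int[1, 2]
# 2.5 cannot be represented as an Int, so the push fails.
outcome = try
    push!(ints, 2.5)
catch err
    err
end
@show outcome isa InexactError   # true
```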

1 Like

This question is actually addressed in an entry of the Julia FAQ. The FAQ entry links to a section in the manual for further info.

1 Like

The ::Array{<:Real} type signature/annotation basically translates as “any type Array{T} for some T that subtypes Real”. In the REPL:

julia> Array{<:Real} == (Array{T} where {T<:Real})
true

Note that only abstract types can have subtypes in Julia.

To understand the above example better, also see the documentation on “UnionAll” types.

“implements it with the inefficient memory layout?”

Regarding this part of the question specifically: the array you pass in keeps its own type and memory layout; the signature only restricts which types are accepted. Julia also specializes the compiled code for the concrete argument types, so there’s no performance penalty, at least once compilation is done. See the Wikipedia article on monomorphization.
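
As a rough illustration (the function name `double` is made up here), a single method with an Array{<:Real} signature hands back results with the same concrete element type that went in:

```julia
double(x::Array{<:Real}) = x .* 2

# The Float64 array stays a Float64 array; nothing is converted to Vector{Real}.
@show typeof(double([1.0, 2.0]))   # Vector{Float64}
@show typeof(double([1, 2]))       # Vector{Int64}

# There is still only one method; the compiler specializes it per argument type.
@show length(methods(double))      # 1
```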

1 Like

Yes, totally this! I’ve already seen this question come up a few times here on Discourse, and answers typically start from internal/technical aspects like memory layout. Meanwhile, there’s a clear intuitive explanation for why Vector{Int} shouldn’t be accepted if a function declares Vector{Real}: you can only put 2.5 into the latter, not the former. Maybe there’s some canonical place to put this short explanation, so that it’s easy to link to afterwards?

4 Likes

my bad

Thanks - very great reference to place this into proper context for a non-typist.

I understand what type-invariance is now but I don’t get your example - if I define a function with argument type Vector{Real}, I would expect it to take a collection that contains a value 2.5, which is not an Int but Float, which is still a subtype of Real. I would not expect that for a function with argument type Vector{Int}.