Vector{Vector{T} where T} vs Vector{Vector{T}} where T


#1

I can see a lot of discussions around UnionAll and new where keyword on GitHub, but most comments seem to make sense only to people who track the topic for a long time. Is there any docs or drafts (except for news) about it? In particular, I’m confused that this:

Vector{Vector{T} where T} 

is treated differently from:

Vector{Vector{T}} where T

Can somebody provide an example of what data would match the first and the second type definition?


#2

There is some discussion of this exact example in the section entitled UnionAll Types in the chapter of the Julia manual on Types, which is here:

http://docs.julialang.org/en/latest/manual/types.html#UnionAll-Types-1

– Steve Vavasis


#3

Imagine the only types in Julia happened to be Int and String, so that T had to be one of those.

Then the first would be equivalent to Vector{Union{Vector{Int},Vector{String}}} and the second to Union{Vector{Vector{Int}}, Vector{Vector{String}}}.


#4

Should this be Vector{Union{Vector{Int}, Vector{String}}}?


#5

I still don’t see what data conforms to these data specification. For example, docs describe Vector{Vector{T} where T} as:

a 1-dimensional array of 1-dimensional arrays; each of the inner arrays consists of objects of the same type, but this type may vary from one inner array to the next.

For me it sounds like e.g.:

[String[], Int[]]

But isa() shows it’s not:

julia> isa([String[], Int[]], Array{Array{T,1} where T,1})
 false

#6

Is that perhaps an error in the docs? I’d think since there is only one T there can only be one type.

Also isa([String[], Int[]], Vector{Vector{Union{String, Int}}}) is false. But that is because typeof([String[], Int[]]) is Vector{Vector{Any}}.

But I also puzzled by what @malmaud suggested

julia> isa(Vector{Vector{Union{String, Int}}}(), Vector{Vector{T} where T})
false

and @greg_plowman

julia> isa(Vector{Union{Vector{Int}, Vector{String}}}(), Vector{Vector{T} where T})
false

Are we on a wild goose chase here?


#7

Yes, fixed. Thanks!


#9

Remember the illustrative only assumption:

Imagine the only types in Julia happened to be Int and String, so that T had to be one of those.


#10

So in practice, it’s more like we have 3 types in our illustrative only type system - String, Int and Any, right? [String[], Int[]] has type Array{Array{Any,1},1} which still seems to be valid according to the definition in docs as well as @malmaud’s example. So why this:

isa(Vector{Vector{Any}}(), Vector{Vector{T} where T}) 

still gives false?


#11

Instead of [String[], Int[]], you need Union{Vector{String}, Vector{Int}}[String[], Int[]] to explicitly type the kind of vector you want to create.

Julia implements the rule that array literals with different types should be constructed as Vector{Any}, rather than Vector{Union{elements in array literal}}, which would have been a valid choice but generally not what people would want.


#12

I agree about array literals and thus used Vector{Vector{Any}}() in the last test, but perhaps I misunderstand union types in general. I played around a bit and here’s a simpler example that I don’t understand:

julia> Vector{String} <: Vector{Union{String, Int}}
false

I used to think that Vector{Union{T1, T2}} is the same as Union{Vector{T1}, Vector{T2}}, i.e. a set of all vectors of either T1 or T2. Apparently, these are 2 very different types. So what’s the meaning of Vector{Union{T1,T2}}? E.g. what’s an example value for x in:

isa(x, Vector{Union{String, Int}})

so that this expression returned true?


#13

A Vector{Union{T1,T2}} is an array whose elements can be of either type T1 or T2. e.g. the first element can be T1, the second element can be T2, and so on. For example:

julia> Union{String,Int}["x", 2]
2-element Array{Union{Int64,String},1}:
  "x"
 2   

On the other hand, a Union{Vector{T1},Vector{T2}} is either a vector of all T1 elements or a vector of all T2 elements. The vector cannot have both T1 and T2 elements in it.

The memory representations of these are very different.

  • A Vector{Union{Int,String}} must be an array of “boxes” (or pointers thereto): type tags + data. That is, for every element there must be a type tag indicating the type of that element.

  • A Vector{Int} (which is a subtype of Union{Vector{Int},Vector{String}}), is very different: the type information is attached to the array as a whole. Since every element is of type Int, it can be (and is) stored in memory as simply one Int value after another (64 bits per element if Int == Int64).

Hence, a Vector{Int} cannot be used where a Vector{Union{Int,String}} is expected, because the memory representations are totally different. The ability to have a uniformly typed array like this, where the element data is “packed” inline without individual type tags, is crucial for performance.

In 0.6 (or in 0.5 with @compat), you can declare a function parameter as Vector{<:Union{Int,String}}, and it will allow Vector{Int}, Vector{String}, or Vector{Union{Int,String}}. (Up to three different versions of your function will be compiled, specialized for the three cases, depending on what argument types you pass.)


#14

This is a very good explanation, thanks! Although I haven’t got used to all use cases of type unions, memory representation was exactly the clue I was missing.


#15

This type “invariance” is also described quite nicely in the manual:

http://docs.julialang.org/en/latest/manual/types.html#Parametric-Composite-Types-1


#16

Related, I wonder if the following two are effectively the same types: Vector{T} where T<:Union{Float64,Int64} and Union{Vector{Float64},Vector{Int64}}.

I know these two are different from Vector{Union{Float64,Int64}}, which is a concrete type that can take both Float64 and Int64 as vector entries. Therefore, this is not a supertype of Vector{Float64}, which can take only Float64 as entries:

julia> const T3 = Vector{Union{Float64,Int64}}
Array{Union{Float64, Int64},1}

julia> Vector{Float64}<:T3
false

On the other hand, the first two types are either a vector of Float64, or a vector of Int64. Therefore, they are both (abstract) supertypes of Vector{Float64}:

julia> const T1 = Vector{T} where T<:Union{Float64,Int64}
Array{T,1} where T<:Union{Float64, Int64}

julia> const T2 = Union{Vector{Float64},Vector{Int64}}
Union{Array{Float64,1}, Array{Int64,1}}

julia> Vector{Float64}<:T1
true

julia> Vector{Float64}<:T2
true

However, they are different types:

julia> T1 == T2
false

Are there any practical differences between T1 and T2? If there are, which one is more preferable in what situations? Or, are they effectively the same, so which one to use is just a matter of taste?


#17

They are not, since there are many other types that matches Vector{T} where T<:Union{Float64,Int64} that doeesn’t match Union{Vector{Float64},Vector{Int64}}. e.g. Vector{Union{}}, Vector{Union{Float64,Int64}}.


#18

Interesting. I didn’t know that Union{} is a subtype of any union type. In general, for two concrete types T and S, what are the subtypes of Union{T1,T2}? Are T1, T2, Union{}, Union{T1,T2} the only ones, or there are something else?


#19

Union{} is a subtype of any type.

I don’t know any other types that are subtype of Union{T1,T2} but I’m not sure…