Vector of Vectors: Type Issue and Tuple Question

Nathan_Boyer · September 15, 2020, 2:25pm

I am trying to understand the difference between the two ways I found to create a vector of vectors below. I found Option 1 in this post, which works nicely but isn’t very customizable. Option 2 also works, but it can’t seem to obtain the type. Why aren’t the types the same between Option 1 and 2, and what does “where T,1” mean? How can I fix Option 2’s typing, so that I don’t take a performance hit?

Bonus Question: My understanding is that tuples are faster than vectors because they are immutable. How can I construct a tuple of vectors and then fill the vectors in a loop? Neither replacing [] with () in Option 1 nor Vector with Tuple in Option 2 worked. I see in my testing that I may modify the interior vectors v[j][i]=, but not the tuples indices themselves v[j]= once created. However, I can’t figure out how to set up this structure if I don’t know the values in the vectors at creation time. How can I achieve this and is the performance gain worth the hassle?

#Vector Lengths
J=4
I=11

#Creation Option 1
v=[Vector{Int}(undef, I) for _ = 1:J]

#Creation Option 2
v=Vector{Vector}(undef, J) # Outer Vector (May in practice want outer type to be a tuple.)
for j in 1:J
    v[j]=Vector{Int}(undef, I) # Inner Vector (May in practice have different lengths.)
end

#Fill with test values.
for j=1:J
    for i=1:I
        v[j][i]=(j-1)*I+i
    end
end

#Show Results
v

Output of Option 1:

4-element Array{Array{Int64,1},1}:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
 [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
 [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33]
 [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44]

Output of Option 2:

4-element Array{Array{T,1} where T,1}:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
 [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
 [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33]
 [34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44]

jules · September 15, 2020, 2:28pm

The type of option 2 is different because you specified v=Vector{Vector}(undef, J). The inner Vector is an abstract type, because to be concrete, a Vector needs a type parameter. The outer one has that, the inner doesn’t. You could have written Vector{Vector{Int}} instead.

jules · September 15, 2020, 2:33pm

Also, the difference between a Tuple and a Vector as a collection of things is not just that one is immutable, it’s also that the Tuple type tells the compiler exactly how many elements it has. That can make a big difference for compiling efficient code (you know how many steps unrolled loops over all the elements must have, for example). So you wouldn’t use a Tuple in a context where code can produce wildly different numbers of elements that need to be stored, because for each new amount of elements, there’s a new Tuple type and therefore a new compilation overhead for methods that work on it.

Nathan_Boyer · September 15, 2020, 2:53pm

Ah, okay. This version of Option 2 now gives equivalent results to Option 1 with concrete types.

#Creation Option 2
v=Vector{Vector{Int}}(undef, J)
for j in 1:J
    v[j]=Vector(undef, I) # May in practice have different lengths.
end

I thought that the type of the inner vector could be picked up when I defined its length, but apparently all the types need to be defined at creation time?

jules · September 15, 2020, 3:03pm

I think even in that version you have a problem, because the Vector you create is of abstract type. It just works because it gets converted to a Vector{Int} when you assign it as an element of a Vector{Vector{Int}}, which can not hold anything other than Vector{Int} elements. But that conversion could cost you.

stevengj · September 15, 2020, 3:03pm

The issue is not the types of the inner vectors, the issue is the type of the container. A Vector{Vector} is different from a Vector{Vector{Int}} because the former is a container for any type of a vector. For example:

julia> vvi = Vector{Int}[] # a Vector{Vector{Int}}
Array{Int64,1}[]

julia> vv = Vector[] # a Vector{Vector}
Array{T,1} where T[]

julia> push!(vvi, [1,2,3])
1-element Array{Array{Int64,1},1}:
 [1, 2, 3]

julia> push!(vv, [1,2,3]) # vv can also store integer vectors
1-element Array{Array{T,1} where T,1}:
 [1, 2, 3]

julia> vvi[1]
3-element Array{Int64,1}:
 1
 2
 3

julia> vv[1] # this element is also a Vector{Int} == Array{Int64,1}
3-element Array{Int64,1}:
 1
 2
 3

julia> push!(vv, ["a", "b", "c"]) # vv can also store vectors of strings
2-element Array{Array{T,1} where T,1}:
 [1, 2, 3]
 ["a", "b", "c"]

julia> push!(vvi, ["a", "b", "c"]) # but vvi cannot
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64
Closest candidates are:
  convert(::Type{T}, ::T) where T<:Number at number.jl:6
  convert(::Type{T}, ::Number) where T<:Number at number.jl:7
  convert(::Type{T}, ::Ptr) where T<:Integer at pointer.jl:23
  ...
Stacktrace:
 [1] setindex!(::Array{Int64,1}, ::String, ::Int64) at ./array.jl:847
 [2] _unsafe_copyto!(::Array{Int64,1}, ::Int64, ::Array{String,1}, ::Int64, ::Int64) at ./array.jl:257
 [3] unsafe_copyto! at ./array.jl:311 [inlined]
 [4] _copyto_impl! at ./array.jl:335 [inlined]
 [5] copyto! at ./array.jl:321 [inlined]
 [6] copyto! at ./array.jl:347 [inlined]
 [7] copyto_axcheck! at ./abstractarray.jl:946 [inlined]
 [8] Array at ./array.jl:562 [inlined]
 [9] convert at ./array.jl:554 [inlined]
 [10] push!(::Array{Array{Int64,1},1}, ::Array{String,1}) at ./array.jl:934
 [11] top-level scope at REPL[9]:1

So, vv (my Vector{Vector} above) is more flexible than vvi (my Vector{Vector{Int}}), because the latter can only store Vector{Int} objects. But this flexibility comes at a price — operations on vv will be slower because the compiler can’t determine the types of the elements until runtime (when the elements are actually assigned), so it must pay the price of runtime type-checks and dynamic dispatch.

lmiq · September 15, 2020, 3:33pm

This seems to suggest that tuples could be indicated for use in place of vectors in general numerical calculations, with advantages. From the docs what I understand is that tuples are mostly used to represent function arguments. There has been more than one post comparing tuples to vectors in terms of speed, maybe someone can provide a clarification on why tuples should not in general be used to substitute immutable vectors? Or should they?

phg · September 15, 2020, 3:48pm

That is a completely correct observation, up to a certain size. (SArrays are implemented in terms of tuples.)

Nathan_Boyer · September 15, 2020, 4:10pm

Thank you. Your answer was very insightful. I see now that I need to concretely type both the container and the inner components if I want that speed boost.

I also spent a while exploring the syntax you used to create your vectors as I had never seen it before with type information. Do you usually create them empty like this and then push! to them? I would think that allocating undefined as I did would be better for speed and finding errors, but perhaps there are other benefits to your method.

Vasily_Pisarev · September 15, 2020, 5:50pm

I think, it was for illustrative purposes, as Vector{Int}[] is faster to type than Vector{Vector{Int}}(undef, m).

Another advantage is the guarantee that there would be no uninitialized elements if the end result is built from an empty array and filled by push!.

The best use of Vector{T}(undef, m) is inside some function that then initializes all of the elements, otherwise forgetting to initialize some of them and push!ing new elements may result in undefined references / garbage in the middle.

Nathan_Boyer · September 15, 2020, 7:46pm

Can anyone answer the bonus question?

When should you use Vector v.s. Tuple v.s. SVector as an outer container, and how can you allocate each such that the inner data can be populated later?

danielw2904 · September 15, 2020, 8:14pm

SVectors are immutable and faster for linear algebra operations if they are small (<100 elements according to the github readme). Idk about vectors vs tuples though.

Nathan_Boyer · September 15, 2020, 8:29pm

The GitHub readme that @phg linked says specifically that SVectors are not immutable.

Note that here “statically sized” means that the size can be determined from the type , and “static” does not necessarily imply immutable .

The difference between the three types may boil down to mutable vs. static vs. immutable, but I don’t understand all the nuance there either.

simeonschaub · September 15, 2020, 8:33pm

SArrays are immutable as well as statically sized. StaticArrays.jl also provides MArray for statically sized mutable arrays though.

danielw2904 · September 15, 2020, 8:56pm

See the docs

The package also provides some concrete static array types: SVector , SMatrix and SArray , which may be used as-is (or else embedded in your own type). Mutable versions MVector , MMatrix and MArray are also exported, as well as SizedArray for annotating standard Array s with static size information.

Topic		Replies	Views
Converting Vector to Tuple seems not optimal Performance tuple , memory-allocation	11	444	June 22, 2024
Preferred type expression for a vector of vectors New to Julia	8	1572	August 5, 2018
Vector of Vectors / Vector of Tuples Performance question	6	195	December 3, 2024
Vector{Vector{T} where T} vs Vector{Vector{T}} where T General Usage	18	2500	March 3, 2022
When to use Tuples instead of Vectors? Performance performance , tuple , vector , ntuple	16	344	September 6, 2024

Vector of Vectors: Type Issue and Tuple Question

Related topics