Preallocating Vector of Vectors

Hi there,

I have a project where I need to preallocate a Vector of Vectors. I could just use a multidimensional array, but this complicates things and makes the code less readable as well. Is there a way that I can preallocate a Vector of Vectors as efficiently as a simple multidimensional array? MWE:

#Matrix{Float64}
function makearr1(T, Nrow, Ncol)
        return Matrix{T}(undef, Nrow, Ncol)
end
#Vectors of Vectors{Float64}
function makearr2(T, Nrow, Ncol)
        return [Vector{T}(undef, Nrow) for _ in 1:Ncol]
end
Nrow = 1000
Ncol = 500
makearr1(Float64, Nrow, Ncol)
makearr2(Float64, Nrow, Ncol)

using BenchmarkTools
@btime makearr1($Float64, $Nrow, $Ncol) #3.629 μs (2 allocations: 3.81 MiB)
@btime makearr2($Float64, $Nrow, $Ncol) #163.800 μs (1003 allocations: 3.89 MiB)

It has roughly the same memory consumption, but a lot more computation time and allocations as I need to initialize each individual array. Is there a straightforward solution to have the same number of allocations for makearr2 as for makearr1?

Best regards

Not really, because you want to do separate allocations. Also, if it takes the same time and the same memory, is the number of allocations that important?

Sorry for the confusion - I just swapped the intialization from zeros to an undefined Vector of arbitrary type T. It is quite a big performance drop in this case unfortunately.

You could allocate the matrix and use views (of the columns) of the matrix in your computations.
(or try to preallocate these vectors only once, I assume these allocations are ocurring repeatedly for now).

Thank you for the input!

I thought about the views alternative as well, but this would result in a different type (Subarray instead of standard Vectors) for the Vector of Vector versions. I wish there was some form of reshape!() for that kind of problem where I only initialize all placeholder once and then fill them.

Usually, this will be only preallocated once, but in specific examples this might change, e.g. in a times series Ncol/Nrow might grow over time and I do not know the size a priori.

Would the following work?

function makearr2b(T, Nrow, Ncol)
    v = Vector{T}(undef, Nrow)
    w = Vector{Vector{T}}(undef, Ncol)
    w .= Ref(v)
end

NB: no it does not, as per @mrVeng post below

if you know the total size ahead of time, you can use:

basically you store the data contiguously in-memory, and when you set/get index, you just “view”, this should save a lot of memory in the limit of having a lot of inner vectors

You want something more flexible than a multidimensional array provides. With individual vectors you can grow the vectors separately for example. And as (almost) always, you have to pay for flexibility with performance.

Why is that an issue?

Thank you! Unfortunately, this causes a pointer issue:

c = makearr3(Float64, Nrow, Ncol) #Vector{Vector{Float64}} with 1000 elements
c[1][1] #NaN
c[2][1] #NaN
c[1][1] = 123. #123.0
c[2][1] #123.0, not NaN

1 Like

What about

function makearr2(T, Nrow, Ncol)
    [ Vector{T}(undef, Ncol) for _ in 1:Nrow ]
end

? That doesn’t have the pitfall of using the exact same (===) vector for all indices.


I have to say, I’m also curious - why are views not ok?

This is the same function as the the second function in my post, right? Apologies if I oversee something.

In the actual code, I define functions on the inner array - usually there are already @views involved, and I also want to be sure that the type of the inner array matches the original type for safey reasons.

I will have a look at ArraysOfArrays.jl, and otherwise I just take the additional time as flexibility costs.

It is. The reason it doesn’t get much more concise is because you want to express that you have a container of containers, with each container being

  1. distinct
  2. able to grow/shrink freely without impacting other vectors

A multidimensional matrix doesn’t have those restrictions, so it’s layout/construction can be simpler. Down in the weeds this means that a matrix can just be a single big chunk of memory, because each row/column has the same size as any other, whereas a vector of vectors has disjoint blocks of memory for each vector, to allow each object to live independently.

4 Likes

Thanks! I think I should copy-paste your comment to the relevant struct field so I never forget why I chose that specific Array format :D.

Just to add that, in these cases, use push! to increase the arrays (do not reallocate the whole stuff). push! can be quite efficient, because the vectors are saved with some extra space and pushing new data only needs to do reallocations eventually.

julia> x = [[1,2],[3,4]];

julia> @allocated push!(x[1],3)
48

julia> @allocated push!(x[1],3)
0

julia> @allocated push!(x[1],3)
80

julia> @allocated push!(x[1],3)
0

julia> @allocated push!(x[1],3)
0

julia> @allocated push!(x[1],3)
0

julia> @allocated push!(x[1],3)
144

julia> x
2-element Vector{Vector{Int64}}:
 [1, 2, 3, 3, 3, 3, 3, 3, 3]
 [3, 4]


Thanks
If not mistaken, writing the loop instead of the comprehension seems to be faster & allocate less (Julia 1.7):

function makearr3(T, Nrow, Ncol)
    w = Vector{Vector{T}}(undef, Ncol)
    for i in 1:Ncol
        w[i] =  Vector{T}(undef, Nrow)
    end
    return w
end

I don’t see that:

julia> function makearr3(T, Nrow, Ncol)
           w = Vector{Vector{T}}(undef, Ncol)
           for i in 1:Ncol
               w[i] =  Vector{T}(undef, Nrow)
           end
           return w
       end
makearr3 (generic function with 1 method)

julia> @btime makearr3(Float64, 500, 1000);
  395.621 μs (1491 allocations: 3.98 MiB)

julia> function makearr2(T, Nrow, Ncol)
           [ Vector{T}(undef, Ncol) for _ in 1:Nrow ]
       end
makearr2 (generic function with 1 method)

julia> @btime makearr2(Float64, 500, 1000);
  191.369 μs (1000 allocations: 3.89 MiB)

What does improve the performance is to allow the function to specialize for the type of element being created:

julia> function makearr4(::Val{T}, Nrow, Ncol) where T
           [ Vector{T}(undef, Ncol) for _ in 1:Nrow ]
       end
makearr4 (generic function with 1 method)

julia> @btime makearr4(Val(Float64), 500, 1000);
  81.485 μs (501 allocations: 3.88 MiB)

Odd.
Here running Julia 1.7 on Win10 laptop:

using BenchmarkTools
@btime makearr1(Float64, $Nrow, $Ncol) #   3.9 μs (2 allocations: 3.81 MiB)
@btime makearr2(Float64, $Nrow, $Ncol) # 181.3 μs (1003 allocations: 3.89 MiB)
@btime makearr3(Float64, $Nrow, $Ncol) #  59.8 μs (501 allocations: 3.88 MiB)

makearr2 and makearr3 are not equivalent. One creates Nrow vectors of length Ncol. The other is switched.

There are two different makearr2's in this thread.

2 Likes

Interesting. Here, with 1.7, the loop version is improved (considering the correction in makearray3 pointed above):


julia> @btime makearr2(Float64, 500, 1000);
  195.416 μs (1000 allocations: 3.89 MiB)

julia> @btime makearr3(Float64, 500, 1000);
  83.967 μs (501 allocations: 3.88 MiB)

julia> @btime makearr4(Val(Float64), 500, 1000);
  85.357 μs (501 allocations: 3.88 MiB)


In Julia 1.6 I have:

julia> @btime makearr3(Float64, 500, 1000);
  189.542 μs (1001 allocations: 3.89 MiB)

Thus, the gain associated to the specialization in 1.6 is given in 1.7 for some reason.

1 Like