Very best way to concatenate an array of arrays


I’ve been wondering about this for a while. Suppose I want to concatenate arrays of arrays. Currently the only way I know of doing that is vcat(A...) (or could be hcat, whatever). My understanding is that this is doubly inefficient, first because it has to allocate a new array (unavoidable), but second because it has to recompile vcat for every number of arguments and every element type of every input array A. Therefore, if you want to do this lots of times on lots of different arrays you are really screwed. I suppose I could just write a function that takes a Vector{<:AbstractVector} and allocates the new array properly, but that seems silly.

What is the very best way of doing this?

using RecursiveArrayTools
VA = VectorOfArray(A)
arr = convert(Array,VA)

VectorOfArray is a lazy representation of the array of arrays as a higher order tensor (/matrix if A is a vector of vectors). It uses the indexing fallback to do the conversion. Honestly, I tried writing my own loop for the conversion but the fallback was faster (Tim Holy Cartesian magic?) so this is what I’ve found to be the fastest, and I don’t know how to recreate it without using the lazy matrix indexing.


Thanks, RecursiveArrayTools looks useful.

I was doing this with a little function in Flux.jl, and a quick test shows it’s sometimes quicker:

A100 = [rand(100) for i=1:100];
using BenchmarkTools

using RecursiveArrayTools
@btime convert(Array,VectorOfArray(A100));  # 25.571 μs

using Flux: batch
@btime batch(A100); # 10.686 μs

maximum( convert(Array,VectorOfArray(A100)) - batch(A100) ) # zero

But for [rand(10) for i=1:1000] this is reversed.


I was stupid and thought vcat with single arg would do the job. It doesn’t.

Whether to be lazy or not kinda depends on how often you plan to access the new array, how much memory you have available and how cache-friendly your planned access is.


vcat(V) just returns V.

Wow, ok I was kind of expecting that by now there’d be a really straightforward answer to this. It’s probably too late now, but shouldn’t there be something in Base that does this? I’m thinking of a function that simply allocates an array with elements type the union of the arguments. It would have to do inefficient allocation, but at least it wouldn’t have the compile penalties of vcat.


See discussion here:

I think the consensus was to implement an efficient reduce(vcat, v) but I think it hasn’t been done yet. It is a performance improvement so should be non-breaking.


There seems to be a fairly common misunderstanding about how compatibility works and what is “too late” after 1.0 is released: you can add new features, including new exported names in 1.x since that does not break older code. What you cannot do is remove or change existing features in a 1.x version.


Note also that a faster reduce(vcat, arrays) method can be added at any time, regardless of feature freezes, since it is just an optimization.


In a slightly more general setting, I have used the following to map and concat an array of array.

mapconcat1(f, v) = [z for x in v for z in f(x)]
mapconcat2(f, v) = vcat(f.(v)...)
mapconcat3(f,v) = mapfoldl(f, append!, [], v)

The first one allocates more but in some cases it was faster than the other two. I guess it depends on the shape of the data.


Maybe I’m missing the intent of the question, but I normally write an ugly function similar to this:

function vec2array(v)
    r = length(v)
    c = length(v[1])
    a = Array{Int64}(r,c)
    for i in 1:r, j in 1:c
        a[i,j] = v[i][j]


In Common Lisp, the equivalent of

foo(args...) = args
a = [1,2,3]
foo(a...) === a

is true. It’s a very good optimization. I wonder how breaking it would be to sometimes pass varargs as arrays instead of tuples (and how complicated it would be to optimize in a language with multiple dispatch).


I don’t think we’d even need anything like that would we? All I had in mind when posting this thread was something roughly like @sdmcallister suggested being in Base. The only complication I can think of is determining the appropriate Union type for the resulting array. I suppose we’d have to be consistent with the typing behavior of vcat.

By the way, as far as I know even StaticArrays are basically just tuples under the hood, so that would seem to suggest that yes, arguments really do need to be passed as tuples.