# How to "collapse" a Vector of Vector?

``````julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
[1.0, 2.0]
[3.0, 4.0, 5.0]
[6.0, 7.0]
``````

is there an efficient way to transform it into:

``````[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
``````

?

thanks.

`reduce(vcat, vv)` is a simple way.

A lazy (and hence efficient) option would be `Iterators.flatten(vv)`.

``````julia> @btime reduce(\$vcat, \$vv);
53.095 ns (1 allocation: 144 bytes)

julia> @btime Iterators.flatten(\$vv);
5.800 ns (1 allocation: 16 bytes)

julia> @btime collect(Iterators.flatten(\$vv));
102.331 ns (4 allocations: 224 bytes)
``````

There also is RecursiveArrayTools.jl which might be useful.

7 Likes

Thanks. `reduce()` seems to be quite efficient (and general).

Just curious if it is possible to be even more efficient than the `vvcat()` function defined below?

``````julia> function vvcat(vv)
n = sum(length, vv)
y = Vector{Float64}(undef, n)
position = 1
for i in Base.OneTo(length(vv) )
curr = vv[i]
l = length(curr)
y[position:(position + l - 1) ] = curr
position += l
end
return y
end
vvcat (generic function with 1 method)

julia> @btime vvcat(\$vv)
49.657 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
1.0
2.0
3.0
4.0
5.0
6.0
7.0

julia> @btime reduce(vcat, \$vv)
52.273 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
1.0
2.0
3.0
4.0
5.0
6.0
7.0
``````

You can stick an `@inbounds` on the for loop to save some time for bounds checking.

Just for fun, I took a crack at further optimization and expanded generality (as I happen to need a function like this at the moment).

``````julia> function vvcat(vv::Vector{Vector{T}})::Vector{T} where T
isempty(vv) && return []
y = Vector{T}(undef, sum(length, vv))
position = 1
for curr in vv
l = length(curr)
@inbounds y[position:(position + l - 1)] = curr
position += l
end
return y
end
vvcat (generic function with 1 method)

julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
[1.0, 2.0]
[3.0, 4.0, 5.0]
[6.0, 7.0]

julia> @btime vvcat(\$vv);  @btime reduce(vcat, \$vv);
39.858 ns (1 allocation: 144 bytes)
47.065 ns (1 allocation: 144 bytes)
``````

You probably want `isempty(vv) && return T[]` or you might see strange bugs from mixing `Any` arrays with `T` arrays in your program.

Thanks for the suggestion. I was assuming the final convert due to the output typing would take care of that, no? But still, I agree - better to remove the ambiguity.

1 Like

You’re right of course. My bad. I don’t use function result type-annotations very often and glossed right over that.

1 Like

This one is 30% faster on my computer:

``````function vvcat(vv::Vector{Vector{T}}) where {T}
out = Vector{T}(undef, sum(length, vv))
i = 0
for v in vv, x in v
@inbounds out[i+=1] = x
end
return out
end
``````
4 Likes

Oh, very nice! I see the same. I am a Julia newbie, and that double-loop syntax does not yet spring to mind. Now I have some back-porting to do. Thanks!

@DNF, do you agree that the `isempty(vv) && return T[]` is still required? Without it, I get `ERROR: ArgumentError: reducing over an empty collection is not allowed` from the `sum` reduction.

Yeah, I didn’t bother with the empty case, I wanted to focus on the point of handling one sample at the time.

You can either do the test, like you suggest, or you can replace

``````sum(length, vv)
``````

with

``````mapreduce(length, +, vv; init=0)
``````

which initializes the empty case to zero length.

2 Likes

Note that `reduce(vcat ,vv::AbstractVector{<:AbstractVector})` uses an optimized implementation essentially equivalent to `vcat(vv...)` (but which avoids the overhead of recompiling for each new number of entries in `vv`).

3 Likes