How to "collapse" a Vector of Vector?

tomtom · January 2, 2020, 3:05pm

julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
 [1.0, 2.0]     
 [3.0, 4.0, 5.0]
 [6.0, 7.0]

is there an efficient way to transform it into:

[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

?

thanks.

carstenbauer · January 2, 2020, 3:06pm

reduce(vcat, vv) is a simple way.

A lazy (and hence efficient) option would be Iterators.flatten(vv).

julia> @btime reduce($vcat, $vv);
  53.095 ns (1 allocation: 144 bytes)

julia> @btime Iterators.flatten($vv);
  5.800 ns (1 allocation: 16 bytes)

julia> @btime collect(Iterators.flatten($vv));
  102.331 ns (4 allocations: 224 bytes)

There also is RecursiveArrayTools.jl which might be useful.

tomtom · January 2, 2020, 3:20pm

Thanks. reduce() seems to be quite efficient (and general).

Just curious if it is possible to be even more efficient than the vvcat() function defined below?

julia> function vvcat(vv)
           n = sum(length, vv)
           y = Vector{Float64}(undef, n)
           position = 1
           for i in Base.OneTo(length(vv) )
               curr = vv[i]
               l = length(curr)
               y[position:(position + l - 1) ] = curr
               position += l
           end
           return y
       end
vvcat (generic function with 1 method)

julia> @btime vvcat($vv)
  49.657 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0

julia> @btime reduce(vcat, $vv)
  52.273 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0

GunnarFarneback · January 2, 2020, 3:36pm

You can stick an @inbounds on the for loop to save some time for bounds checking.

tro3 · January 2, 2020, 4:19pm

Just for fun, I took a crack at further optimization and expanded generality (as I happen to need a function like this at the moment).

julia> function vvcat(vv::Vector{Vector{T}})::Vector{T} where T
           isempty(vv) && return []
           y = Vector{T}(undef, sum(length, vv))
           position = 1
           for curr in vv
               l = length(curr)
               @inbounds y[position:(position + l - 1)] = curr
               position += l
           end
           return y
       end
vvcat (generic function with 1 method)

julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
 [1.0, 2.0]
 [3.0, 4.0, 5.0]
 [6.0, 7.0]

julia> @btime vvcat($vv);  @btime reduce(vcat, $vv);
  39.858 ns (1 allocation: 144 bytes)
  47.065 ns (1 allocation: 144 bytes)

non-Jedi · January 2, 2020, 4:26pm

You probably want isempty(vv) && return T[] or you might see strange bugs from mixing Any arrays with T arrays in your program.

tro3 · January 2, 2020, 4:39pm

Thanks for the suggestion. I was assuming the final convert due to the output typing would take care of that, no? But still, I agree - better to remove the ambiguity.

non-Jedi · January 2, 2020, 4:40pm

You’re right of course. My bad. I don’t use function result type-annotations very often and glossed right over that.

DNF · January 2, 2020, 6:33pm

This one is 30% faster on my computer:

function vvcat(vv::Vector{Vector{T}}) where {T}
    out = Vector{T}(undef, sum(length, vv))
    i = 0
    for v in vv, x in v
       @inbounds out[i+=1] = x
    end
    return out
end

tro3 · January 2, 2020, 6:36pm

Oh, very nice! I see the same. I am a Julia newbie, and that double-loop syntax does not yet spring to mind. Now I have some back-porting to do. Thanks!

tro3 · January 2, 2020, 9:45pm

@DNF, do you agree that the isempty(vv) && return T[] is still required? Without it, I get ERROR: ArgumentError: reducing over an empty collection is not allowed from the sum reduction.

DNF · January 2, 2020, 10:50pm

Yeah, I didn’t bother with the empty case, I wanted to focus on the point of handling one sample at the time.

You can either do the test, like you suggest, or you can replace

sum(length, vv)

with

mapreduce(length, +, vv; init=0)

which initializes the empty case to zero length.

nalimilan · January 5, 2020, 2:36pm

Note that reduce(vcat ,vv::AbstractVector{<:AbstractVector}) uses an optimized implementation essentially equivalent to vcat(vv...) (but which avoids the overhead of recompiling for each new number of entries in vv).

Topic		Replies	Views
Flattening a vector of vectors New to Julia	3	2699	December 9, 2022
Vector of Vectors with different size to one Vector General Usage vector	9	4230	June 2, 2020
How to efficiently re-arrange a vector of vectors into a matrix? New to Julia array	5	5414	January 6, 2022
Flatten to a vector General Usage question , array , data_structures	26	16337	February 19, 2023
How to get vector from set of ranges? General Usage	7	620	May 24, 2022

How to "collapse" a Vector of Vector?

Related topics