julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
[1.0, 2.0]
[3.0, 4.0, 5.0]
[6.0, 7.0]
is there an efficient way to transform it into:
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
?
thanks.
julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
[1.0, 2.0]
[3.0, 4.0, 5.0]
[6.0, 7.0]
is there an efficient way to transform it into:
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
?
thanks.
reduce(vcat, vv)
is a simple way.
A lazy (and hence efficient) option would be Iterators.flatten(vv)
.
julia> @btime reduce($vcat, $vv);
53.095 ns (1 allocation: 144 bytes)
julia> @btime Iterators.flatten($vv);
5.800 ns (1 allocation: 16 bytes)
julia> @btime collect(Iterators.flatten($vv));
102.331 ns (4 allocations: 224 bytes)
There also is RecursiveArrayTools.jl which might be useful.
Thanks. reduce()
seems to be quite efficient (and general).
Just curious if it is possible to be even more efficient than the vvcat()
function defined below?
julia> function vvcat(vv)
n = sum(length, vv)
y = Vector{Float64}(undef, n)
position = 1
for i in Base.OneTo(length(vv) )
curr = vv[i]
l = length(curr)
y[position:(position + l - 1) ] = curr
position += l
end
return y
end
vvcat (generic function with 1 method)
julia> @btime vvcat($vv)
49.657 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
1.0
2.0
3.0
4.0
5.0
6.0
7.0
julia> @btime reduce(vcat, $vv)
52.273 ns (1 allocation: 144 bytes)
7-element Array{Float64,1}:
1.0
2.0
3.0
4.0
5.0
6.0
7.0
You can stick an @inbounds
on the for loop to save some time for bounds checking.
Just for fun, I took a crack at further optimization and expanded generality (as I happen to need a function like this at the moment).
julia> function vvcat(vv::Vector{Vector{T}})::Vector{T} where T
isempty(vv) && return []
y = Vector{T}(undef, sum(length, vv))
position = 1
for curr in vv
l = length(curr)
@inbounds y[position:(position + l - 1)] = curr
position += l
end
return y
end
vvcat (generic function with 1 method)
julia> vv = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0] ]
3-element Array{Array{Float64,1},1}:
[1.0, 2.0]
[3.0, 4.0, 5.0]
[6.0, 7.0]
julia> @btime vvcat($vv); @btime reduce(vcat, $vv);
39.858 ns (1 allocation: 144 bytes)
47.065 ns (1 allocation: 144 bytes)
You probably want isempty(vv) && return T[]
or you might see strange bugs from mixing Any
arrays with T
arrays in your program.
Thanks for the suggestion. I was assuming the final convert due to the output typing would take care of that, no? But still, I agree - better to remove the ambiguity.
You’re right of course. My bad. I don’t use function result type-annotations very often and glossed right over that.
This one is 30% faster on my computer:
function vvcat(vv::Vector{Vector{T}}) where {T}
out = Vector{T}(undef, sum(length, vv))
i = 0
for v in vv, x in v
@inbounds out[i+=1] = x
end
return out
end
Oh, very nice! I see the same. I am a Julia newbie, and that double-loop syntax does not yet spring to mind. Now I have some back-porting to do. Thanks!
@DNF, do you agree that the isempty(vv) && return T[]
is still required? Without it, I get ERROR: ArgumentError: reducing over an empty collection is not allowed
from the sum
reduction.
Yeah, I didn’t bother with the empty case, I wanted to focus on the point of handling one sample at the time.
You can either do the test, like you suggest, or you can replace
sum(length, vv)
with
mapreduce(length, +, vv; init=0)
which initializes the empty case to zero length.
Note that reduce(vcat ,vv::AbstractVector{<:AbstractVector})
uses an optimized implementation essentially equivalent to vcat(vv...)
(but which avoids the overhead of recompiling for each new number of entries in vv
).