Package for lazy hcat/vcat of a large number of vectors

I am looking for the lazy equivalent of

v = [rand(500) for _ in 1:5000];
mapreduce(permutedims, vcat, v)

where both dimensions may be large (up to 10⁵). This is trivial to code up, but I want to avoid duplication.
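
For reference, a minimal sketch of what the hand-rolled version could look like, using only Base. The type name `LazyRows` is made up for illustration and does not come from any package, and the sketch assumes the outer vector is non-empty and all inner vectors have the same length.

```julia
# Hypothetical wrapper (the name LazyRows is made up for this sketch).
# Row i of the matrix is v[i]; nothing is copied.
struct LazyRows{T, V<:AbstractVector} <: AbstractMatrix{T}
    rows::V
end

# Outer constructor: take the element type from the inner vectors.
LazyRows(rows::AbstractVector) =
    LazyRows{eltype(eltype(rows)), typeof(rows)}(rows)

# Assumes `rows` is non-empty and all inner vectors have equal length.
Base.size(A::LazyRows) = (length(A.rows), length(first(A.rows)))

function Base.getindex(A::LazyRows, i::Int, j::Int)
    @boundscheck checkbounds(A, i, j)
    return A.rows[i][j]
end

v = [rand(4) for _ in 1:3]
m = LazyRows(v)
m == mapreduce(permutedims, vcat, v)  # elementwise equal to the eager result
```

Defining only `size` and `getindex` is enough for a read-only `AbstractMatrix`, so slicing, iteration and `==` fall back to the generic methods.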

Is there a package which has such functionality?

I thought of LazyArrays.jl but it only has vararg syntax, cf

This one from RecursiveArrayTools usually works quite well:

https://recursivearraytools.sciml.ai/stable/array_types/#RecursiveArrayTools.VectorOfArray

Thanks! After I pressed submit, I also realized that

using JuliennedArrays
Align(v, False(), True())

works too.

SplitApplyCombine.jl has functions specifically for this, both lazy and eager.

julia> using SplitApplyCombine

# eager - combinedims
julia> combinedims(v)
500×5000 Matrix{Float64}:
...

julia> combinedims(v, 1)
5000×500 Matrix{Float64}:
...

# lazy - just change to combinedimsview

The inverse operation is there as well - splitdims.

You could also use TensorCast, which has the most intuitive syntax:

using TensorCast
@cast m[i,j] := v[i][j]  

This led me to LazyArrays.stack, which I ended up using.

Thanks for all the great replies.

A related question: what if the elements are matrices, as in

v = [rand(500, 50) for _ in 1:5];
reduce(vcat, v) # need lazy version

To combine two indices into one, you can use TensorCast’s operator ⊗:

@cast m[j⊗i, k] := v[i][j,k]  (i in 1:5) 
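
To spell out the index arithmetic behind the combined index, here is a rough Base-only sketch of a lazy `reduce(vcat, v)` for equally sized matrices. The type name `LazyVcat` is hypothetical, and this is not how TensorCast implements it. With `nj` rows per block, row `r` of the result lives in block `i` at row `j`, where `r = j + (i - 1) * nj`.

```julia
# Hypothetical type for illustration; assumes a non-empty vector of
# matrices that all have the same size.
struct LazyVcat{T, V<:AbstractVector} <: AbstractMatrix{T}
    blocks::V
end

LazyVcat(blocks::AbstractVector) =
    LazyVcat{eltype(eltype(blocks)), typeof(blocks)}(blocks)

function Base.size(A::LazyVcat)
    nj, nk = size(first(A.blocks))
    return (nj * length(A.blocks), nk)
end

function Base.getindex(A::LazyVcat, r::Int, k::Int)
    @boundscheck checkbounds(A, r, k)
    nj = size(first(A.blocks), 1)
    # Split the combined row index: block i, row j within that block.
    i, j = divrem(r - 1, nj) .+ 1
    return A.blocks[i][j, k]
end

v = [rand(3, 2) for _ in 1:4]
m = LazyVcat(v)
m == reduce(vcat, v)  # elementwise equal to the eager result
```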

There is also the SentinelArrays.ChainedVector type, specifically for vectors. It stores the chunks in a vector internally, so it won’t have the same StackOverflow problem.

Exactly the same solution with combinedimsview (: