Package for lazy concatenation along arbitrary higher dimensions?

tlnagy · February 16, 2021, 7:15pm

Are there any packages that let you concatenate lazily along arbitrary dimensions like the eager cat does?

julia> mats = [rand(5,6,7,8), rand(5,6,7,8)];

julia> combo = cat(mats..., dims=3); # this, but lazy

julia> size(combo)
(5, 6, 14, 8)

In my case, each array in mats is very large (sometimes not even in memory), but it would be very convenient to be able to concatenate them into one large array lazily.

LazyArrays.jl appears to be the closest, but AFAICT it only supports lazy concatenation along the first or second dimension.

sunbubble · February 16, 2021, 7:48pm

You might be able to express this as a transducer. Check out Transducers.jl (JuliaFolds/Transducers.jl on GitHub), a package of efficient transducers for Julia.
Transducers are composable transformations that are only evaluated once you call collect. As a crude analogy, think of xf = 1:10 as the transducer; calling collect(xf) then produces the array. Transducers, however, let you express many transformations this way, such as map, filter, and fold. You can compose them with piping syntax such as 1:10 |> Filter(isodd) |> Filter(isprime) |> collect, where the lazy pipeline is xf = 1:10 |> Filter(isodd) |> Filter(isprime).

And I guess you can just keep appending values, and collect them once you need them.

I hope this helps.

tlnagy · February 16, 2021, 8:30pm

You might be able to express this as a transducer.

Transducers look very interesting and could probably express my problem, but I was hoping for something that lets me wrap the lazily concatenated arrays in an AxisArray object for use in my analysis, as follows:

julia> AxisArray(lazycat(mats..., dims=3), axisinfo...)

That way my current approach would keep working, except that getindex would now work out lazily, on each access, where to fetch the value from. Anything that returns a <: AbstractArray should work.
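To make the requested behaviour concrete, here is a minimal sketch in plain Julia of an AbstractArray wrapper whose getindex decides on access which underlying array holds the value. The names `LazyCat` and `lazycat` are hypothetical and do not come from any package mentioned in this thread. Unlike the splatted call above, this sketch takes the arrays as a vector, and it assumes all parts share the same array type and ordinary 1-based indexing.

```julia
# Hypothetical sketch: `LazyCat` / `lazycat` are illustrative names, not a package API.
struct LazyCat{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parts::Vector{A}       # the arrays being joined (referenced, never copied)
    dim::Int               # dimension along which they are joined
    offsets::Vector{Int}   # offsets[k] = combined length along `dim` of parts[1:k-1]
end

function lazycat(parts::AbstractVector{A}; dims::Int) where {T,N,A<:AbstractArray{T,N}}
    isempty(parts) && throw(ArgumentError("need at least one array"))
    1 <= dims <= N || throw(ArgumentError("dims must be between 1 and $N"))
    ref = size(first(parts))
    # Every dimension except `dims` must agree across all parts.
    for p in parts, d in 1:N
        d == dims || size(p, d) == ref[d] ||
            throw(DimensionMismatch("sizes must match except along dimension $dims"))
    end
    offsets = cumsum([0; [size(p, dims) for p in parts]])
    return LazyCat{T,N,A}(collect(parts), dims, offsets)
end

function Base.size(c::LazyCat{T,N}) where {T,N}
    ref = size(first(c.parts))
    return ntuple(d -> d == c.dim ? c.offsets[end] : ref[d], Val(N))
end

function Base.getindex(c::LazyCat{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(c, I...)
    i = I[c.dim]
    k = searchsortedfirst(c.offsets, i) - 1    # which part contains index i
    local_i = i - c.offsets[k]                 # position of i inside that part
    J = ntuple(d -> d == c.dim ? local_i : I[d], Val(N))
    return c.parts[k][J...]
end

mats = [rand(5, 6, 7, 8), rand(5, 6, 7, 8)]
combo = lazycat(mats; dims=3)
size(combo)                      # (5, 6, 14, 8)
combo == cat(mats...; dims=3)    # true
```

Because `LazyCat` is a subtype of AbstractArray, it should in principle be accepted wherever a generic AbstractArray is, though element-by-element access like this will be slower than indexing a materialized array. For parts that are not in memory, the same idea applies as long as each part supports getindex.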

mcabbott · February 16, 2021, 9:02pm

There are quite a few packages which do this. Mine is called LazyStack.jl, but JuliennedArrays and RecursiveArrayTools do similar things.

aplavin · February 17, 2021, 7:17am

SplitApplyCombine.jl seems an obvious choice here, with its combinedimsview function, which does exactly what you need.
