Package for lazy concatenation along arbitrary higher dimensions?

Are there any packages that let you concatenate lazily along arbitrary dimensions like the eager cat does?

julia> mats = [rand(5,6,7,8), rand(5,6,7,8)];

julia> combo = cat(mats..., dims=3); # this, but lazy

julia> size(combo)
(5, 6, 14, 8)

In my case, each array in mats is very large (sometimes not even in memory), but it would be very convenient to be able to concatenate them into one large array lazily.

LazyArrays.jl appears to be the closest, but AFAICT it only supports lazy concatenation along the first or second dimension.

You might be able to express this as a transducer. Check out Transducers.jl (JuliaFolds/Transducers.jl on GitHub), a package of efficient transducers for Julia.
Transducers are composable descriptions of a transformation that are only evaluated once you call collect. As a crude analogy, think of xf = 1:10 as the transducer; calling collect(xf) then produces the array. Unlike a range, though, transducers let you express many transformations this way, such as map, filter, fold and more. You can compose them with piping syntax, e.g. 1:10 |> Filter(isodd) |> Filter(isprime) |> collect (isprime comes from Primes.jl), where the lazy pipeline is xf = 1:10 |> Filter(isodd) |> Filter(isprime).

And I guess you can just keep appending values, and collect them once you need them.

I hope this helps.

You might be able to express this as a transducer.

Transducers look very interesting and could probably express my problem. However, I was hoping to find something that lets me wrap the lazily concatenated arrays in an AxisArray object for my analysis, as follows:

julia> AxisArray(lazycat(mats..., dims=3), axisinfo...)

That way my current approach would keep working, except that getindex would now lazily figure out where to fetch each value on access. Anything that returns a <: AbstractArray should work.
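For illustration, here is a minimal sketch of what such a wrapper could look like using only Base. The names LazyCat and lazycat are hypothetical and not taken from any of the packages mentioned in this thread. The sketch assumes read-only access and that all parts share an element type and number of dimensions.

```julia
# Hypothetical lazy concatenation wrapper (not from any package).
struct LazyCat{T,N,P<:Tuple} <: AbstractArray{T,N}
    parts::P              # the wrapped arrays, stored by reference (no copy)
    dim::Int              # dimension along which the parts are joined
    offsets::Vector{Int}  # offsets[k] = combined length of parts[1:k-1] along `dim`
end

function lazycat(A1::AbstractArray{T,N}, rest::AbstractArray{T,N}...; dims::Int) where {T,N}
    1 <= dims <= N || throw(ArgumentError("dims must be between 1 and $N"))
    parts = (A1, rest...)
    # Every part must match the first one on all dimensions except `dims`.
    for p in parts, d in 1:N
        d == dims || size(p, d) == size(A1, d) ||
            throw(DimensionMismatch("sizes differ along dimension $d"))
    end
    offsets = [0; cumsum([size(p, dims) for p in parts])]
    return LazyCat{T,N,typeof(parts)}(parts, dims, offsets)
end

function Base.size(A::LazyCat{T,N}) where {T,N}
    base = size(first(A.parts))
    # Same size as the first part, except the total length along `dim`.
    return ntuple(d -> d == A.dim ? A.offsets[end] : base[d], Val(N))
end

function Base.getindex(A::LazyCat{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(A, I...)
    i = I[A.dim]
    # Locate the part k satisfying offsets[k] < i <= offsets[k+1].
    k = searchsortedfirst(A.offsets, i) - 1
    # Shift the index along `dim` into that part's own coordinates.
    local_I = ntuple(d -> d == A.dim ? i - A.offsets[k] : I[d], Val(N))
    return A.parts[k][local_I...]
end

mats = [rand(5, 6, 7, 8), rand(5, 6, 7, 8)]
combo = lazycat(mats...; dims=3)
size(combo)                      # (5, 6, 14, 8)
combo == cat(mats...; dims=3)    # true
```

Because LazyCat subtypes AbstractArray, it should in principle be accepted wherever an AbstractArray is, including as the data of an AxisArray, though that combination is untested here. Storing the parts in a tuple makes the lookup type-unstable when the parts have different types, and splatting a very long vector of arrays will compile slowly, so a real package is likely to be the better choice.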

There are quite a few packages which do this. Mine is called LazyStack.jl, but JuliennedArrays and RecursiveArrayTools do similar things.

SplitApplyCombine.jl seems an obvious choice here with its combinedimsview function that does exactly what you need.
