Are there any packages that let you concatenate lazily along arbitrary dimensions, like the eager version below?
julia> mats = [rand(5,6,7,8), rand(5,6,7,8)];
julia> combo = cat(mats..., dims=3); # this, but lazy
julia> size(combo)
(5, 6, 14, 8)
In my case, each array in mats is very large (sometimes not even in memory), but it would be very convenient to be able to concatenate them into one large array lazily.
LazyArrays.jl appears to be the closest, but AFAICT it only supports lazy concatenation along the first or second dimension.
You might be able to express this as a transducer. Check out Transducers.jl (JuliaFolds/Transducers.jl: Efficient transducers for Julia).
Transducers are compositions of transformations that are only evaluated once you call collect. As a crude analogy, think of xf = 1:10 as the transducer; calling collect(xf) then produces the array. Unlike a plain range, though, transducers let you express many transformations this way, such as maps, filters, and folds. You can also compose them with a piping syntax such as 1:10 |> Filter(isodd) |> Filter(isprime) |> collect, where the lazy part is xf = 1:10 |> Filter(isodd) |> Filter(isprime) and nothing is computed until collect runs.
And I guess you can just keep appending values, and collect them once you need them.
I hope this helps.
Transducers look very interesting and could probably express my problem, but I was hoping to find something whose result I could wrap in an AxisArray object for use in my analysis, as follows:
julia> AxisArray(lazycat(mats..., dims=3), axisinfo...)
That way my current approach would still work, except that getindex would lazily figure out where to fetch each value on access. Anything that returns a <: AbstractArray should work.
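To make the request concrete, here is a rough sketch of what such a wrapper could look like using only Base. The names LazyCat and lazycat are hypothetical and do not come from any package mentioned in this thread. The sketch assumes all parts share the same element type and number of dimensions, and that each part supports scalar getindex; it only implements read access.

```julia
# Hypothetical lazy concatenation wrapper; LazyCat and lazycat are
# illustrative names, not an existing package API.
struct LazyCat{T,N,A<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parts::Vector{A}       # the arrays being concatenated (not copied)
    dim::Int               # dimension along which they are concatenated
    offsets::Vector{Int}   # cumulative end index of each part along `dim`
end

function lazycat(parts::AbstractArray{T,N}...; dims::Int) where {T,N}
    1 <= dims <= N || throw(ArgumentError("dims must be between 1 and $N"))
    # Every part must match the first one in all dimensions except `dims`.
    ref = size(first(parts))
    for p in parts
        all(d -> d == dims || size(p, d) == ref[d], 1:N) ||
            throw(DimensionMismatch("sizes differ outside dimension $dims"))
    end
    ps = collect(parts)
    offsets = cumsum([size(p, dims) for p in ps])
    return LazyCat{T,N,eltype(ps)}(ps, dims, offsets)
end

function Base.size(A::LazyCat{T,N}) where {T,N}
    sz = size(first(A.parts))
    # Same size as the parts, except along `dim`, where the lengths add up.
    return ntuple(d -> d == A.dim ? last(A.offsets) : sz[d], Val(N))
end

function Base.getindex(A::LazyCat{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(A, I...)
    i = I[A.dim]
    # Find the first part whose cumulative end index reaches `i`.
    k = searchsortedfirst(A.offsets, i)
    start = k == 1 ? 0 : A.offsets[k - 1]
    # Shift the index along `dim` into the local range of part `k`.
    J = ntuple(d -> d == A.dim ? i - start : I[d], Val(N))
    return A.parts[k][J...]
end

mats = [rand(5, 6, 7, 8), rand(5, 6, 7, 8)]
lazy = lazycat(mats...; dims = 3)
size(lazy)                                  # (5, 6, 14, 8)
lazy[2, 3, 9, 4] == mats[2][2, 3, 2, 4]     # true
lazy == cat(mats...; dims = 3)              # true
```

Since LazyCat subtypes AbstractArray, generic code that only relies on size and getindex should accept it; whether a particular wrapper such as AxisArray does so efficiently is something I have not checked. Note that each element access pays for a small search over offsets, so the packages suggested in the replies are likely a better fit for real workloads.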
There are quite a few packages which do this. Mine is called LazyStack.jl, but JuliennedArrays and RecursiveArrayTools do similar things.
SplitApplyCombine.jl seems an obvious choice here with its combinedimsview function that does exactly what you need.