# Arrays with non-uniform dimensions

I’m looking for a data-structure that essentially works as an array-of-arrays, where each “row” can have a different number of “columns”. The point of that is being able to easily perform map or broadcast, etc, as if it were a multidimensional array. I thought this was possible with `OffsetArrays` but I couldn’t find a reference. Is there any module for that?

Maybe `RaggedArrays.jl`
https://github.com/mbauman/RaggedArrays.jl

1 Like

Just found this thread from a couple of years ago mentioning an idea that actually relates to this topic, “recursive broadcasts”…

In my opinion the best approach would be to have a data-structure that can be easily mapped to/from arrays of arrays, and then support mapping etc (isomorphism? catamorphism? I forgot the correct term). There’s an AbstractTrees module, but I’m not sure how easy is it to use that.

If you just want to apply a single function to an array of arrays, you may consider defining a function having as input an array of arrays. I think something like the following should work:

``````function f(x::T)::T where T <: Real
return x^2
end

function f(x_n::Array{Array{T,1}})::Array{Array{T,1}} where T <: Real
return [f.(x) for x in x_n]
end
``````

Indeed, we can define such a function and get the task done. The point is to be able to do it using the same code you might write for eg arrays. If your data is stored in a regular multi-dimensional tensor, you can `map` or broadcast a function over the data structure, and get a new data structure containing the mapped results. If you now have this different, irregular data-structure, your code written with `map`, comprehensions or broadcasts cannot be used anymore.

What I’m looking for is a way to have a view of this array-of-arrays into a type, say `VectorTree`, and then now we could define `map(x::VectorTree{T})`, like you did. Now we can reuse code written with `map` or broadcasts, etc, and then feed this high-level code either a multidimensional array or a `VectorTree`, and the code still works, it does the right thing depending on what the input type of the data-structure is.

The point is Julia has great support for high-level constructs like `map` as long as you’re using multi-dimensional arrays. What is understandable. I stumbled upon a case of having written my code on top of arrays, and now I wanted to reuse the same code processing an irregular array-of-arrays, but I need to reuse the same `map` functors, etc. And perhaps the biggest question would be how to generalize whatever we implement, such as your function, to an arbitrary number of levels… Maybe just assume the data is a tree with a certain height?

Here’s one proposal for how to do this. The structure merely holds a reference to the array-of-arrays, and the type of the tree leaves. `map` runs recursively until the `eltype` matches the leaf type, and then we apply the function. This should support any tree height, trees with different depths for each branch, and also trees where the leaves are actually `Arrays` that should not be traversed.

``````import Base.map

struct ArrayTree{L,T,N}
a::Array{T,N}
end

function Base.map(f, at::ArrayTree{L,T,N}) where {L,T,N}
ArrayTree{L,T,N}(arraytreemap_(f, at.a, L))
end

function arraytreemap_(f, array, L)
if eltype(array) == L
map(f, array)
else
map(array) do x
arraytreemap_(f, x, L)
end
end
end

myarray = [rand(3), rand(1), rand(2)]
@show myarray
aa = ArrayTree{Float64, Array{Float64, 1}, 1}(myarray)
ma = map(aa) do x x^2+100 end
@show ma.a

# myarray = [[0.27519533394619367, 0.5878159031539618, 0.3940555494438669], [0.07240825460216072], [0.43409689370057825, 0.4750862276744192]]
# ma.a = [[100.07573247182576, 100.3455275360007, 100.15527977604751], [100.00524295533454], [100.18844011312049, 100.22570692372591]]
``````

I created a package GitHub - nlw0/ArrayTrees.jl: ArrayTrees for Julia

map and broadcast work totally fine with arrays and collections containing any types, including inner arrays. They always do the same thing: apply a function to each element of the given collection, whatever those elements are. This is crucial for generic programming!

Recursing while/till some arbitrary type is encountered can also be useful in some cases, but the specific recursion criteria should be clearly spelled or even defined explicitly by the user.
Do you want to recurse into all abstractarrays? If yes, this would go into `SVector`s and `FieldArray`s, that are often treated as atomic objects. If not, then even `view`s of simple arrays won’t be recursed into.
Do you also want to traverse other collections like dicts, sets, and iterators? And so on.

can be directly solved with `Accessors.jl`: `@modify(x -> x^2 + 100, myarray |> Elements() |> Elements())`. `Accessors` also have the `Recursive` selector with arbitrary conditions.
I’m extremely interested in generic programming, that’s the whole point. `map` does what it does on `Arrays` or `AbstractArrays` and there’s absolutely no question of changing this. This would be ludicrous. The question is if you have eg a `Vector{Vector{X}}` and you would like to treat it as a different data structure, with its own semantics, where we can `map` over it, what can you do? `Accessors` seems pretty interesting, but it still does not go all the way from taking the list-of-lists and making it work with `map`. There’s AbstractTrees.jl as well but I’m not sure it offers a solution. The point of the solution I suggested is not whether we are capable of getting the task done, which is trivial, the point is to make it generic and working with `map`. We seem to achieve that. I’d love to hear ideas of other ways we might potentially take a `Vector{Vector{X}}` and run some code with `map` over it, not merely achieve the same effect with eg a double for loop or whatever, but using `map` specifically, which I assume could then be adapted to work with broadcasting as well.