"invert" nested arrays/collections

Is there an existing library that would transform a vector of collections (e.g. Tuple, NamedTuple, Array) into a collection of vectors, recursively?

For example (note that using fill with identical elements is just for the MWE):

fill((a = (b = 2, c = 3), d = [1,2]), 10)
# would become
(a = (b = fill(2, 10), c = fill(3, 10)), d = fill([1,2], 10))

SplitApplyCombine.invert can do the first layer, but it is not recursive.
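For reference, the recursive behaviour being asked for can be sketched in plain Julia without any package. This is only an illustration: invert_recursive is a hypothetical name, not a function from SplitApplyCombine or any other library, and it assumes the element type of the vector is a concrete Tuple or NamedTuple type (anything else, including inner arrays, is treated as a leaf and left as a vector).

```julia
# Hypothetical helper (not from any package): turn a vector of nested
# (Named)Tuples into a nested (Named)Tuple of vectors.
function invert_recursive(v::AbstractVector)
    T = eltype(v)
    if T <: NamedTuple && isconcretetype(T)
        names = fieldnames(T)
        # Collect each field into its own vector, then recurse into it.
        cols = map(n -> invert_recursive([getfield(x, n) for x in v]), names)
        return NamedTuple{names}(cols)
    elseif T <: Tuple && isconcretetype(T)
        return ntuple(i -> invert_recursive([x[i] for x in v]), fieldcount(T))
    else
        # Leaf: numbers, arrays, or anything without a concrete tuple schema.
        return v
    end
end

inverted = invert_recursive(fill((a = (b = 2, c = 3), d = [1, 2]), 10))
```

Here inverted.a.b is a 10-element vector of 2s, while inverted.d stays a vector of the original [1, 2] arrays, matching the desired output above.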

Itch scratched:

https://github.com/tpapp/NestedMaps.jl

julia> using NestedMaps

julia> A = [(a = i, b = (c = -i, d = [i^2, -i^2])) for i in 1:3]
3-element Array{NamedTuple{(:a, :b),Tuple{Int64,NamedTuple{(:c, :d),Tuple{Int64,Array{Int64,1}}}}},1}:
 (a = 1, b = (c = -1, d = [1, -1]))
 (a = 2, b = (c = -2, d = [4, -4]))
 (a = 3, b = (c = -3, d = [9, -9]))

julia> nested_map(identity, A)
(a = [1, 2, 3], b = NamedTuple{(:c, :d),Tuple{Int64,Array{Int64,1}}}[(c = -1, d = [1, -1]), (c = -2, d = [4, -4]), (c = -3, d = [9, -9])])

julia> nested_map_recursive(sum, A)
(a = 6, b = (c = -6, d = [14, -14]))

API is WIP.

You could probably also use StructArrays. It supports “unnesting” of Tuples, NamedTuples and custom structs (not vectors, though StaticVectors could be supported). The unwrap keyword specifies which “inner” field types should be unnested; it defaults to t -> false, so no inner fields are unnested. In the example below we ask it to unwrap only Tuple and NamedTuple.

julia> using StructArrays

julia> A = [(a = i, b = (c = -i, d = (i^2, -i^2))) for i in 1:3];

julia> unwrap(t) = t <: Union{Tuple, NamedTuple}
unwrap (generic function with 1 method)

julia> s = StructArray(A, unwrap = unwrap);

julia> StructArrays.fieldarrays(s)
(a = [1, 2, 3], b = NamedTuple{(:c, :d),Tuple{Int64,Tuple{Int64,Int64}}}[(c = -1, d = (1, -1)), (c = -2, d = (4, -4)), (c = -3, d = (9, -9))])

julia> s[2]
(a = 2, b = (c = -2, d = (4, -4)))

The main advantage, I guess, is that it supports custom structs and that the result is an AbstractArray, so you don’t really need to keep the original representation. On the minus side, it is built on the assumption that each struct encodes its “schema” (the types and number of fields) in its type. That is true for Tuple, NamedTuple, Pair and SVector, but not for normal arrays, so I don’t think it can be used to unnest Vector{Vector}.

The package itself is reasonably well tested and mature (IndexedTables uses it for table representation) but horribly under-documented. I will try to remedy that soon.

This seems very useful (I often want this sort of thing – at least, the non-recursive kind – when broadcasting a tuple-valued function), but I wonder about the name. I would expect something called “map” to preserve the structure and only map values. This operation looks more like a kind of zip or transpose.
If I understand correctly, the non-recursive version with identity is the same as this suggested unzip function:
https://github.com/JuliaLang/julia/issues/13942
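To make the comparison concrete, here is a minimal sketch of what a non-recursive unzip could look like for the broadcasting use case. The name unzip_sketch is hypothetical and this is not the implementation proposed in that issue; it assumes a non-empty collection whose elements are all tuples of the same length.

```julia
# Hypothetical, non-recursive "unzip": a collection of N-tuples becomes
# an N-tuple of vectors. Assumes `v` is non-empty and its elements all
# have the same length.
unzip_sketch(v) = ntuple(i -> [x[i] for x in v], length(first(v)))

# Typical use: split the result of broadcasting a tuple-valued function.
quotients, remainders = unzip_sketch(divrem.(7:9, 2))
```

After this, quotients holds the first component of each divrem result and remainders the second, which is the zip/transpose reading of the operation rather than a structure-preserving map.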