# How can I convert a matrix of vectors to a 3d array?

Hello there,

I have a matrix of vectors. The type of the vactors are something non-default. I want to convert to a 3d array. I can see that similar questions have been asked for vectors of matrices, but those answers did not work for me.

Could you please say how I can do it, and explain the logic behind the solution, so I can generalize it to other dimensions in the future?

One way would be:

``````julia> M = [rand(3) for i=1:2,j=1:4]
2Ă—4 Matrix{Vector{Float64}}:
[0.456187, 0.302626, 0.00769471]  [0.217349, 0.599273, 0.922622]  â€¦  [0.335926, 0.450889, 0.760988]
[0.337041, 0.520466, 0.108746]    [0.244348, 0.393246, 0.185435]     [0.77141, 0.882466, 0.125375]

julia> [M[i,j][k] for i=1:2,j=1:4,k=1:3]
2Ă—4Ă—3 Array{Float64, 3}:
[:, :, 1] =
0.456187  0.217349  0.738685  0.335926
0.337041  0.244348  0.921115  0.77141

[:, :, 2] =
0.302626  0.599273  0.633224  0.450889
0.520466  0.393246  0.445012  0.882466

[:, :, 3] =
0.00769471  0.922622  0.87023   0.760988
0.108746    0.185435  0.498452  0.125375
``````

Note that since the comprehension value is `M[i,j][k]`, this expression can be changed simply to get the dimensions in different order.

Also, in terms of sizes in different dimensions, a more generic comprehension could be:

``````[M[i,j][k] for i=axes(M,1),j=axes(M,2),k=eachindex(M[1,1])]
``````

or even:

``````[M[I][k] for I=CartesianIndices(M),k=eachindex(M[1,1])]
``````

In terms of speed, putting the inner index as the first dimension is faster because of memory layout:

``````[M[I][k] for k=eachindex(M[1,1]),I=CartesianIndices(M)]
``````
1 Like

Julia 1.9 will have a command for this, which works for any dimensionality. On earlier versions you can get it from Compat.jl:

``````julia> mv = [[a,b] for a in 1:3, b in 11:14]
3Ă—4 Matrix{Vector{Int64}}:
[1, 11]  [1, 12]  [1, 13]  [1, 14]
[2, 11]  [2, 12]  [2, 13]  [2, 14]
[3, 11]  [3, 12]  [3, 13]  [3, 14]

julia> using Compat

julia> stack(mv)
2Ă—3Ă—4 Array{Int64, 3}:
[:, :, 1] =
1   2   3
11  11  11

[:, :, 2] =
1   2   3
12  12  12

[:, :, 3] =
1   2   3
13  13  13

[:, :, 4] =
1   2   3
14  14  14

julia> size(ans) == (size(first(mv))..., size(mv)...)
true

julia> mt = [(a,b) for a in 1:3, b in 11:14]
3Ă—4 Matrix{Tuple{Int64, Int64}}:
(1, 11)  (1, 12)  (1, 13)  (1, 14)
(2, 11)  (2, 12)  (2, 13)  (2, 14)
(3, 11)  (3, 12)  (3, 13)  (3, 14)

julia> stack(mv) == stack(mt)
true
``````
2 Likes

Thanks a lot boys!

Another option for a pre-`stack` solution, you can use something like:

``````julia> z = reshape([1:3, 4:6, 7:9, 10:12], 2, 2)
2Ă—2 Matrix{UnitRange{Int64}}:
1:3  7:9
4:6  10:12

julia> reshape(reduce(hcat,z), 3, 2, 2)
3Ă—2Ă—2 Array{Int64, 3}:
[:, :, 1] =
1  4
2  5
3  6

[:, :, 2] =
7  10
8  11
9  12
``````

But this is less flexible in the dimension ordering of the output, compared to a comprehension-based method.

Note that this `reduce(hcat,z)` does not hit the fast path, and hence calls `hcat` many times, which is O(N^2). You can do better by moving the `reshape`s aroundâ€¦ but this illustrates why the magic fast path is a bit of a footgun:

``````julia> mv2 = [[a,b] for a in 1:30, b in 11:104];

julia> @time reduce(hcat, mv2);  # notice MB allocation
0.002909 seconds (2.82 k allocations: 15.663 MiB)

julia> @time reduce(hcat, vec(mv2));  # uses the fast path
0.000093 seconds (4 allocations: 44.188 KiB)

julia> @time stack(mv2);
0.000064 seconds (2 allocations: 44.125 KiB)

julia> stack(mv2) == reshape(reduce(hcat, vec(mv2)), 2, 30, 94)
true

julia> @which reduce(hcat, mv2)  # slow pairwise hcat
reduce(op, A::AbstractArray; kw...)
@ Base reducedim.jl:406

julia> @which reduce(hcat, vec(mv2))  # magic fast path
reduce(::typeof(hcat), A::AbstractVector{<:AbstractVecOrMat})
@ Base abstractarray.jl:1709
``````

Good catch. I hate those specializations. They have issues like this and are inconsistent with other reduction functions in `Base`. We have `maximum` for collections rather than `reduce(max,...)`, so why not `concat` or `concatenate` or some other verb for concatenating elements of a collection?

Thatâ€™s what `stack` is.

There might be room for verbs besides `stack`. IMO the logical meaning of `concatenate` is to keep every dimension in place, i.e. for a vector of matrices, concatenate vectors along the 1st dim of the container:

``````julia> concatenate(mv)
6Ă—4 Matrix{Int64}:
1   1   1   1
11  12  13  14
2   2   2   2
11  12  13  14
...
``````

Another candidate is an eager cousin of `Iterators.flatten`:

``````julia> flatten(mv)
24-element Vector{Int64}:
1
11
2
11
3
...
``````
1 Like

But `stack` is less flexible and does all sorts of reshaping along the way.

``````julia> z = [1:3 for _ in 1:10];

julia> stack(z) # fine, probably a reasonable thing to do
3Ă—10 Matrix{Int64}:
1  1  1  1  1  1  1  1  1  1
2  2  2  2  2  2  2  2  2  2
3  3  3  3  3  3  3  3  3  3

julia> stack(z; dims=1) # transposes each element?
10Ă—3 Matrix{Int64}:
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3
1  2  3

julia> stack(z; dims=3)
ERROR: ArgumentError: cannot stack slices ndims(x) = 1 along dims = 3

julia> @btime cat((\$z)...; dims=3);
8.467 ÎĽs (266 allocations: 10.59 KiB)
``````

How do I concatenate the vectors of `z` along the third dimension to produce a 3x1x10 array? My best builtin option is `cat`, but it has horrible performance. My actual best option is this particular case is a `reduce(hcat,z)`, `reduce(vcat,z)`, or `stack(z)` followed by a `reshape`, but reshape games are tedious and annoying to write and awful to read.

I suppose that really this is showing that `stack` should (maybe?) be modified to work along arbitrary dimensions, even those beyond the trailing dimension.

Also, `stack` always inserts a dimension whereas `cat` does not. You can do reshape gymnastics to make all this work, but an iterable-based `cat` wouldnâ€™t require those things.

`[z... ;;;]`

Which lowers to

`hvncat(3, z...)`

``````julia> @btime [z... ;;;]
3.250 ÎĽs (55 allocations: 2.81 KiB)
``````

EDIT: Though curious this is a bit worse than if `z` is an array of arrays vs. ranges; probably some performance gains to be made there: `2.489 ÎĽs (13 allocations: 1.11 KiB)`

The reason this is forbidden is type-stability. Unlike `cat(...; dims)` the return type of `stack` does not depend on the keyword.

Allowing `stack(...; dims=Val(3))` would be the smallest extension. Allowing `stack(...; dims=(3, 2))` might be another way, matching `eachslice(ones(3,1,10); dims=(3,2))`. The issue about this is 47794.

Well, stack usually preserves all dimensions â€“ `axes(result)` is the union of inner & outer. Both the inner elements and the outer container have a 1st axis, and they cannot both end up first. This is rule has very simple indexing, `A[o...][i...] == stack(A)[i..., o...]`. (I say â€śusuallyâ€ť becuase with a keyword, `A[o][i...] = stack(A; dims=1)[o, i...]`, which with multi-dimensional `A` has `axes(vec(A))` instead.)

How do I concatenate the vectors of `z` along the third dimension to produce a 3x1x10 array? â€¦ reshape gymnastics to make all this work, but an iterable-based `cat` wouldnâ€™t require those things.

There are quite a few possible rules for functions combining arrays of arrays into one. Different people seem to think different ways are the only natural extension, and are surprised when the computer does not read their mind. What rules do you see this function obeying, on an M-array of N-arrays? And does it fit with `cat([1 2], [3 4], dims=(1,2))`?

I think the most obvious `concatenate` function would never move an axis, and hence would combine the two 1st axes here. Thatâ€™s one way to generalise the other existing magic fast path, `reduce(vcat, z)`, but wonâ€™t do what you ask.

1 Like

Are there any subtle differences in behavior between `Base.stack` and `SplitApplyCombine.combinedims` to be mindful of? At first glance they seem to produce the same results. The former seems to benchmark slightly faster, and since it doesnâ€™t require an additional import, I will probably start using it instead of the latter. (Though the reverse function `SplitApplyCombine.splitdims` is often very handy, together the â€śviewâ€ť versions of the two aforementioned functions.)

``````julia> using SplitApplyCombine

julia> mv = [[a,b] for a in 1:3, b in 11:14];

julia> z = [1:3 for _ in 1:10];

julia> stack(mv) == combinedims(mv)
true

julia> stack(z) == combinedims(z)
true

julia> stack(z; dims=1) == combinedims(z, 1)
true
``````

Iâ€™d settle for a clone of `cat` that operates on an iterable rather than vararg. Although thereâ€™s also room for a â€ś`blockconcat`â€ť function that concatenates the inner objects along the outer dimensions (i.e., a AxB array of CxDxE arrays becomes a ACxBDxE array, or otherwise works so long the inner objects form a valid grid).

We certainly wonâ€™t cover every concatenation paradigm an adversary could conceive, but currently weâ€™re missing (what I would consider) some of the most common ones. Rather, in many cases we have them but only in vararg form.

`reduce(hcat,...)` and friends are bad idioms because 1) they are intuitively (i.e., without knowledge of the specializations) the worst way to perform this operation and 2) the specializations miss many cases (like my earlier recommendation that somebody pointed out missed the fastpath) so you go from â€śbestâ€ť to â€śworstâ€ť performance unexpectedly.