# Convert a 3D Array to 2D array of vectors

Hi,

I have a 3D array (`X::Array{Int8, 3}`, size `N x K x 2`). How can I convert it to a Matrix (size `N x K`) of `Vector` such that new `X` will be `X[i, j] = X[i, j, :]` with very little new memory allocation.

Thanks,
Vishnu Raj

Maybe:

``````map(CartesianIndices(axes(vol)[1:2])) do i
vol[Tuple(i)..., :]
end
``````

You can also use views, if that works for your use case (`view(vol, Tuple(i)..., :)`)

Btw, that sounds like a job for StaticArrays:

``````using StaticArrays
vol = rand(1000, 1000, 2)
function test(vol)
ax = axes(vol)
map(CartesianIndices((ax[1], ax[2]))) do i
x, y = Tuple(i)
SVector(ntuple(z-> vol[x, y, z], 2))
end
end
``````

Which should allocate much less and be much more gc friendly

1 Like

@sdanisch, that is pretty advanced code and hard to grasp

An alternative herein, using TensorCast.jl, which seems to use the same â€śblack box technologyâ€ť under the hood as StaticArrays, but thanks to a very logical and accessible syntax, one can get equivalent results in a more user-friendly way:

``````using TensorCast
@cast out2[i,j]{k} := vol[i,j,k]
``````

NB: curly brackets allow different interpretation slices of the same memory, and assignment `:=` returns views

Note that if you can cheat a bit (donâ€™t need a `Matrix`, but an `AbstractMatrix` with the same shape will do), you can use essentially no memory at all with StructArrays + StaticArrays

``````julia> using StaticArrays, StructArrays, BenchmarkTools

julia> function to_structarray(X::AbstractArray{T, N}) where {T, N}
return StructArray{SVector{2, T}}(X, dims=N)
end
to_structarray (generic function with 1 method)

julia> X = rand(Int8, 1000, 1000, 2);

julia> @btime S = to_structarray(\$X)
540.741 ns (8 allocations: 368 bytes)
1000Ă—1000 StructArray(view(::Array{Int8, 3}, :, :, 1), view(::Array{Int8, 3}, :, :, 2)) with eltype SVector{2, Int8}:
[-122, 88]    [95, 113]    [-66, 65]    [37, -3]      â€¦  [73, -90]    [-50, -98]   [37, 5]       [78, -51]
[-126, -16]   [-54, 5]     [94, 38]     [78, 84]         [-23, -34]   [24, -15]    [102, 50]     [-122, -62]
â‹®                                                    â‹±
[-76, -88]   [78, 20]      [91, -34]    [26, 52]        [-117, 58]    [-60, -95]   [-125, -35]   [-60, 68]
[7, 54]      [66, 78]      [68, -61]    [66, -83]       [93, -123]    [-124, -47]  [15, 106]     [2, -72]
``````

If you want a concrete `Matrix` you can do `collect(S)`, but depending on your application you may not need to.

2 Likes

Thank you for all your answers. I tried to benchmark the solutions (code below).
The results were for a `500 x 500 x 2` 3D Array.

``````Using StaticArray, by @sdanisch                 :    4.280 ÎĽs (1 allocation: 5.06 KiB)
Using TensorCast, by @rafael.guerra             :  515.948 ns (6 allocations: 288 bytes)
Using StaticArrays and StructArrays, by @piever :  403.785 ns (8 allocations: 368 bytes)
``````

It seems using TensorCast as suggested by @rafael.guerra gives the best.

Code used:

``````using BenchmarkTools

# Using StaticArray, by @sdanisch
using StaticArrays

function method1(X)
ax = axes(X)

map(CartesianIndices((ax[1], ax[2]))) do i
x, y = Tuple(i)
SVector(ntuple(z -> X[x, y, z], 2))
end
end

# Using TensorCast, by @rafael.guerra
using TensorCast

function method2(X)
@cast out[i, j]{k} := X[i, j, k]
end

# Using StaticArrays and StructArrays, by @piever
using StaticArrays, StructArrays

function method3(X::AbstractArray{T, N}) where {T, N}
return StructArray{SVector{2, T}}(X, dims=N)
end

### Benchmarking
X = rand(Int8, 50, 50, 2);

@btime X1 = method1(\$X);

@btime X2 = method2(\$X);

@btime X3 = method3(\$X);
``````

Note that if you go for the TensorCast solution (which IIUC under the hood uses `reinterpret` to reinterpret a given block of memory as a matrix of `SVector`s) you should try to have your data in the `2 x N x K` rather than `N x K x 2` format. Then you can just do `reinterpret(reshape, SVector{2, Int8}, X)` to get your matrix, as each `SVector` is stored contiguously in the array as `X[:, j, k]`.

The `N x K x 2` format is a â€śstruct of arraysâ€ť format, where the first component of all the arrays `X[:, :, 1]` is stored contiguously, and so is the second component `X[:, :, 2]`.

To see what I mean in practice, experiment benchmarking functions that use the resulting arrays. For example `@btime sum(\$X2)` (where `X2` was computed with TensorCast) should be substantially slower than `@btime sum(\$X1)` or `@btime sum(\$X3)` (which hopefully should be very similar).

2 Likes

This is correct. It applies this to something like a `PermutedDimsArray` when given the `N x K x 2` format, Iâ€™m somewhat surprised this works in fact, and would not expect it to be particularly fast. The simple re-interpretation is things like `@cast out3[j,k]{i} := vol3[i,j,k]`, making SVectors from the first dimension.

I 'm not sure it knows about ` reinterpret(reshape, ...` (which was new in Julia 1.6), so it inserts more reshapes than ideal:

``````julia> @cast out2[i,j]{k} := vol[i,j,k]; summary(out2)
"1000Ă—1000 reshape(reinterpret(StaticArrays.SVector{2, Float64}, reshape(transmute(::Array{Float64, 3}, (3, 1, 2)), 2000000)), 1000, 1000) with eltype StaticArrays.SVector{2, Float64}"
``````
2 Likes