Hi,
I have a 3D array (X::Array{Int8, 3}, size N x K x 2). How can I convert it to a Matrix (size N x K) of Vectors, such that the new X is X[i, j] = X[i, j, :], with very little new memory allocation?
Thanks,
Vishnu Raj
Maybe:
map(CartesianIndices(axes(vol)[1:2])) do i
vol[Tuple(i)..., :]
end
You can also use views, if that works for your use case (view(vol, Tuple(i)..., :)).
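To make that concrete, here is a small runnable sketch of the view-based variant (the array name and sizes are illustrative):

```julia
# Build an N x K matrix whose entries are lazy views into vol.
# Only the small view wrappers are allocated; the data itself is not copied.
vol = rand(Int8, 4, 3, 2)

Xv = map(CartesianIndices(axes(vol)[1:2])) do i
    view(vol, Tuple(i)..., :)
end

size(Xv)                   # (4, 3)
Xv[2, 3] == vol[2, 3, :]   # true: same values, backed by vol's memory
```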
Btw, that sounds like a job for StaticArrays:
using StaticArrays
vol = rand(1000, 1000, 2)
function test(vol)
ax = axes(vol)
map(CartesianIndices((ax[1], ax[2]))) do i
x, y = Tuple(i)
SVector(ntuple(z-> vol[x, y, z], 2))
end
end
This should allocate much less and be much more GC-friendly.
@sdanisch, that is pretty advanced code and hard to grasp.
An alternative, using TensorCast.jl, which seems to use the same “black box technology” under the hood as StaticArrays, but thanks to a very logical and accessible syntax one can get equivalent results in a more user-friendly way:
using TensorCast
@cast out2[i,j]{k} := vol[i,j,k]
NB: the curly brackets produce slices that reinterpret the same memory, and the := assignment returns views.
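As a quick check that the {k} slices really share memory with the parent array (a sketch; the variable names are illustrative):

```julia
using TensorCast

vol = rand(4, 3, 2)
@cast out[i, j]{k} := vol[i, j, k]

# Mutating the parent array is visible through the cast result,
# because := wraps the same memory rather than copying it.
vol[1, 1, 2] = 0.0
out[1, 1][2] == 0.0   # true
```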
Note that if you can cheat a bit (you don’t need a Matrix, but an AbstractMatrix with the same shape will do), you can use essentially no memory at all with StructArrays + StaticArrays:
julia> using StaticArrays, StructArrays, BenchmarkTools
julia> function to_structarray(X::AbstractArray{T, N}) where {T, N}
return StructArray{SVector{2, T}}(X, dims=N)
end
to_structarray (generic function with 1 method)
julia> X = rand(Int8, 1000, 1000, 2);
julia> @btime S = to_structarray($X)
540.741 ns (8 allocations: 368 bytes)
1000×1000 StructArray(view(::Array{Int8, 3}, :, :, 1), view(::Array{Int8, 3}, :, :, 2)) with eltype SVector{2, Int8}:
[-122, 88] [95, 113] [-66, 65] [37, -3] … [73, -90] [-50, -98] [37, 5] [78, -51]
[-126, -16] [-54, 5] [94, 38] [78, 84] [-23, -34] [24, -15] [102, 50] [-122, -62]
⋮ ⋱
[-76, -88] [78, 20] [91, -34] [26, 52] [-117, 58] [-60, -95] [-125, -35] [-60, 68]
[7, 54] [66, 78] [68, -61] [66, -83] [93, -123] [-124, -47] [15, 106] [2, -72]
If you want a concrete Matrix you can do collect(S), but depending on your application you may not need to.
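For example (a small sketch; the array and sizes are illustrative):

```julia
using StaticArrays, StructArrays

X = rand(Int8, 4, 3, 2)
S = StructArray{SVector{2, Int8}}(X, dims=3)  # lazy AbstractMatrix over X

M = collect(S)  # materializes a concrete Matrix{SVector{2, Int8}}
M isa Matrix{SVector{2, Int8}} && M == S      # true
```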
Thank you for all your answers. I tried to benchmark the solutions (code below).
The results were for a 50 x 50 x 2 3D Array.
Using StaticArray, by @sdanisch : 4.280 μs (1 allocation: 5.06 KiB)
Using TensorCast, by @rafael.guerra : 515.948 ns (6 allocations: 288 bytes)
Using StaticArrays and StructArrays, by @piever : 403.785 ns (8 allocations: 368 bytes)
Going by allocations, TensorCast as suggested by @rafael.guerra comes out best, although the StructArrays approach is slightly faster.
Code used:
using BenchmarkTools
# Using StaticArray, by @sdanisch
using StaticArrays
function method1(X)
ax = axes(X)
map(CartesianIndices((ax[1], ax[2]))) do i
x, y = Tuple(i)
SVector(ntuple(z -> X[x, y, z], 2))
end
end
# Using TensorCast, by @rafael.guerra
using TensorCast
function method2(X)
@cast out[i, j]{k} := X[i, j, k]
end
# Using StaticArrays and StructArrays, by @piever
using StaticArrays, StructArrays
function method3(X::AbstractArray{T, N}) where {T, N}
return StructArray{SVector{2, T}}(X, dims=N)
end
### Benchmarking
X = rand(Int8, 50, 50, 2);
@btime X1 = method1($X);
@btime X2 = method2($X);
@btime X3 = method3($X);
Note that if you go for the TensorCast solution (which IIUC under the hood uses reinterpret to reinterpret a given block of memory as a matrix of SVectors), you should try to have your data in the 2 x N x K rather than N x K x 2 format. Then you can just do reinterpret(reshape, SVector{2, Int8}, X) to get your matrix, as each SVector is stored contiguously in the array as X[:, j, k].
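A minimal sketch of that layout, with illustrative sizes:

```julia
using StaticArrays

# Assumed layout: component axis first, size 2 x N x K
Y = rand(Int8, 2, 4, 3)

# Zero-copy: each contiguous column Y[:, j, k] becomes one SVector{2, Int8},
# and the remaining dimensions give an N x K matrix.
M = reinterpret(reshape, SVector{2, Int8}, Y)

size(M)                                     # (4, 3)
M[2, 3] == SVector(Y[1, 2, 3], Y[2, 2, 3])  # true
```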
The N x K x 2 format is a “struct of arrays” format, where the first component of all the arrays X[:, :, 1] is stored contiguously, and so is the second component X[:, :, 2].
To see what I mean in practice, try benchmarking functions that use the resulting arrays. For example @btime sum($X2) (where X2 was computed with TensorCast) should be substantially slower than @btime sum($X1) or @btime sum($X3) (which hopefully should be very similar).
This is correct. When given the N x K x 2 format, TensorCast applies this reinterpretation to something like a PermutedDimsArray; in fact I’m somewhat surprised this works, and would not expect it to be particularly fast. The simple re-interpretation happens for things like @cast out3[j,k]{i} := vol3[i,j,k], which makes SVectors from the first dimension.
I’m not sure it knows about reinterpret(reshape, ...) (which was new in Julia 1.6), so it inserts more reshapes than ideal:
julia> @cast out2[i,j]{k} := vol[i,j,k]; summary(out2)
"1000×1000 reshape(reinterpret(StaticArrays.SVector{2, Float64}, reshape(transmute(::Array{Float64, 3}, (3, 1, 2)), 2000000)), 1000, 1000) with eltype StaticArrays.SVector{2, Float64}"