Most efficient way to stack vectors on the fourth dimension


let x=[1 2; 3 4] and y=copy(x) I want to stack the two arrays together on the 4th dimension.

E.g. xy = cat(x, y, dims=4) will get what I wanted

However the above is fine but when you apply the same technique to a large data such as the MNIST dataset the speed is really slow e.g.

using Flux, Flux.Data.MNIST
imgs = MNIST.images()
X = cat(float.(imgs)..., dims = 4) # SLOW!!!

What’s the most efficient code to create X? (which is done by stacking 60,000 28x28 matrices together on the 4th dimension.)


I assume you want to stack 3-dimensional arrays (and not matrices) along dims=4.

I would permute dimensions (permutedims), flatten (vec), then use reduce(hcat, ...) which is optimized, and finally reshape and permutedims back.

function my_cat(X)
	k = length(X)
	k == 0 && return nothing

	s = size(X[1])
	T = eltype(X[1])
	@assert length(s) == 2
	@assert all(x -> s == size(x), X)
	@assert all(x -> T == eltype(x), X)

	n = prod(s)
	Y_vec = Vector{T}(undef, k*n)
	offset = 0
	for x in X
		@inbounds Y_vec[offset + 1: offset + length(x)] .= vec(x)
		offset += length(x)
	Y = reshape(Y_vec, size(X[1])..., 1, k)
	return Y

x = [1 2; 3 4];
y = copy(x);
X = [x, y];


Correct. Because it’s image data with height width and colour channels; that makes 3 dims. But the data is just grey images so the colourchannel dim disappears

ANN: TensorSlice.jl → TensorCast.jl

I made a package for this:

@cast stacked[i,j,_,n] := xlist[n][i,j] |> float

Assuming xlist = [x,y] are your images, and you really did mean 4th dimension not 3rd.

It’s just doing something like reduce(hcat,…), but the point is not to have to add a comment # axes are row,col,colour,number or a docstring to the function you write.

Edit: actually that’s quite slow on all 60000. But this is fast: lazy uses RecursiveArrayTools to make a lazy array, instead of reduce(cat,...).

@cast stacked[i,j,_,n] := float( imgs[n][i,j] )  lazy;

(You could write |= rather than := to indicate that you want a copy at the end, but in fact broadcasting float has this effect here anyway.)


Looks really good! This is such a great idea! Manipulating high dimensional arrays (tensors) need really good intuitive abstractions!

It would be good to include installation instructions using Pkg.add method as I am using and so it’s not as convenient to assess the REPL.


A solution inspired by your description

using Flux, Flux.Data.MNIST
imgs = MNIST.images()
# X = cat(float.(imgs)..., dims = 4) # SLOW!!!

x = float.(imgs)
x_stacked = reshape(reduce(hcat, vec.(x)), 28, 28, 1, 60_000)

# all checks out!!!
all([all(x[k] .== x_stacked[:,:,:,k]) for k = 1:60_000]) # true

@mohamed82008 I appreciate your efforts but I like my one-liner better already :wink:


Ah will try to figure this out, hadn’t looked in ages. I can clone but have difficulty installing. The package should be registered in a few days, then perhaps easier.

This indeed seems good. Interesting that mapreduce(vec, hcat, x) does not hit the same optimisation, and is very slow.


No worries. I like your solution too. Just notice that the above line allocates unnecessarily. You can use a lazy map here from

Edit: mapreduce will do the same but it seems that @mcabbott already commented on that.