Indexing multidimensional arrays of arbitrary dimension

hrchen · June 26, 2019, 3:37pm

I am writing a simulation that outputs a multidimensional array A that changes over time, and whose ndims depends on the input dimension (1 or 2 for now). I would like to save A over time in a larger array A_t in my for-loop, for which I used CartesianIndices. The following is a MWE:

In 1D:

A_t = zeros(10,2)
A = ones(10)
@btime A_t[CartesianIndices(A),1] = A
@btime A_t[CartesianIndices(A),1] .= A

gives:

133.829 ns (1 allocation: 16 bytes)
1.287 μs (8 allocations: 176 bytes)

Clearly, not broadcasting is a lot faster.

In 2D:

A_t = zeros(10,10,2)
A = ones(10,10)
# @btime A_t[CartesianIndices(A),1] = A
@btime A_t[CartesianIndices(A),1] .= A

Now, not broadcasting throws the following error:

DimensionMismatch("tried to assign 10×10 array to 100×1 destination")

It appears that appending a 1 at the end LinearIndexes the CartesianIndices() object. I would like to append a 1 behind all the CartesianIndices without broadcasting (which would be slower in 1D), but have not found a way to do so.

Any advice? Thanks!

mbauman · June 26, 2019, 3:47pm

Use $ to flag the non-global “arguments” to your benchmarked expression in order to get a true sense of the performance here.

julia> @btime A_t[CartesianIndices(A),1] = A;
  157.917 ns (1 allocation: 16 bytes)

julia> @btime A_t[CartesianIndices(A),1] .= A;
  1.449 μs (8 allocations: 176 bytes)

julia> @btime $A_t[CartesianIndices($A),1] = $A;
  37.287 ns (0 allocations: 0 bytes)

julia> @btime $A_t[CartesianIndices($A),1] .= $A;
  195.013 ns (2 allocations: 64 bytes)

The reason you’re seeing this is because CartesianIndices isn’t optimized within SubArrays — that’s an optimization that we could probably implement quite easily.

simeonschaub · June 26, 2019, 3:54pm

This confused me a bit at first, too, since:

julia> A_t[CartesianIndices(A), 1]
10×10 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0

So getindex actually materializes into a matrix, but when using setindex! with CartesianIndices, it behaves like assigning to a one-dimensional array. Your right side therefore always needs to be one-dimensional, which is why this happens to work in the first case. You can probably get the behavior you want with either of those two:

julia> @btime $A_t[CartesianIndices($A),1] = $A[:];
  521.353 ns (1 allocation: 896 bytes)

julia> @btime $A_t[CartesianIndices($A),1] = @view $A[:];
  570.647 ns (3 allocations: 128 bytes)

A[:] basically flattens A into a one-dimensional Vector, so the dimensions of the left and right hand side now match. This is still significantly faster than broadcasting on my machine.

mbauman · June 26, 2019, 3:57pm

Note that when you use square-bracketed indexing as the left hand side of a = assignment, it’s not doing getindex (and thus isn’t materializing the array). In the case of A[I, J] = X, it’s doing setindex!(A, X, I, J), and in the case of A[I, J] .= X it’s doing (essentially) broadcast!(view(A, I, J), identity, X). It’s the view with CartesianIndices that’s slow.

hrchen · June 26, 2019, 4:29pm

Thank you! This is indeed faster than broadcasting for me too. And in 1D,

A_t = zeros(10,2)
A = ones(10)
@btime $A_t[CartesianIndices($A),1] = $A
@btime $A_t[CartesianIndices($A),1] = $A[:]
@btime $A_t[CartesianIndices($A),1] = @view $A[:]

gives:

28.836 ns (0 allocations: 0 bytes)
80.790 ns (1 allocation: 160 bytes)
43.803 ns (1 allocation: 48 bytes)

so flattening+@view is not too much slower. In my original post I was hoping to manipulate the CartesianIndices like I can to a Tuple of Arrays, like so:

x=(ones(2),ones(2))
append!.(x,1)

but I don’t think CartesianIndices can be manipulated in this way.

simeonschaub · June 26, 2019, 5:13pm

I understand that, but intuitively, one would expect, that if getindex returns a matrix, setindex with the same indices would behave similarly, as with matrices.

hrchen:

In my original post I was hoping to manipulate the CartesianIndices like I can to a Tuple of Arrays, like so:
x=(ones(2),ones(2))
append!.(x,1)
but I don’t think CartesianIndices can be manipulated in this way.

If you do inds = collect(CartesianIndices(A)), you can modify inds just like any other array, if that’s what you’re looking for. Maybe you can specify a bit more, what behavior exactly you want, so we can help you.

hrchen · June 26, 2019, 5:27pm

julia> x=zeros(2); ind=collect(CartesianIndices(x))
2-element Array{CartesianIndex{1},1}:
 CartesianIndex(1,)
 CartesianIndex(2,)

I would like to obtain

 CartesianIndex(1, 1)
 CartesianIndex(2, 1)

from ind, perhaps by appending the index 1 to the CartesianIndexes. I was hoping that this would lead to the fastest way to assign A to A_t, that would work in arbitrary dimension.

mbauman · June 26, 2019, 5:42pm

You’ll see much better performance all around if you just use :s instead. You can construct the CartesianIndices you want, though:

julia> CartesianIndices((axes(A)..., 1:1))
10×1 CartesianIndices{2,Tuple{Base.OneTo{Int64},UnitRange{Int64}}}:
 CartesianIndex(1, 1)
 CartesianIndex(2, 1)
...

hrchen · June 26, 2019, 8:21pm

By :s, you mean metaprogramming? I tried the following:

d=2
A_t = zeros(10,10,2)
A = ones(10,10)
eval(Meta.parse("A_t["*repeat(":,",d-1)*":,1] = A"))

which works. However, when I embed the above in a function, it returns an UndefVarError: A_t not defined
This is related to several threads on eval not having access to function scope. Is there a simple workaround here?

simeonschaub · June 26, 2019, 8:28pm

That’s a really ugly hack. Why not do A_t[fill(:, 2)..., 1] = A instead?

mbauman · June 26, 2019, 8:28pm

No, I mean just using colons directly to select entire dimensions, like A_t[:, 1] or B[:, :, 1], etc. Those can be much more efficient than CartesianIndices because they convey additional meaning — we know (in the type domain) that they select an entire dimension whereas CartesianIndices may select any arbitrary rectangular subset. The former can compile down to really efficient computations, whereas the latter cannot.

You can also check out EllipsisNotation.jl for arbitrary numbers of colons.

hrchen · June 26, 2019, 9:00pm

Thanks! Both fill(:,d) and EllipsisNotation.jl are what I’m looking for. And the former is almost as fast as A_t[:,:,1] in 2D.

mbauman · July 19, 2019, 1:42am

A post was split to a new topic: Help reshaping a 3D array

Topic		Replies	Views
Accessing multi-dimensional arrays in 1 go General Usage question , array	5	429	September 15, 2020
Iteration with CartesianIndices(array) is slow when the dimension of the array is large Performance indexing	2	814	August 25, 2019
Extra dimensions and CartesianIndices General Usage indexing	2	282	April 19, 2023
Cheapest way for creating a CartesianIndex General Usage question , cartesianindices	9	475	November 6, 2022
CartesianIndices syntax General Usage indexing	6	685	April 9, 2019

Indexing multidimensional arrays of arbitrary dimension

Related topics