Operations on elements of a common index of an array of matrices

I have an array of matrices. Is it possible to update groups of common indices in place? The code below doesn’t work but I think conveys what I am trying to do. Yes it would be easier to do this with a 3D matrix, but for other reasons it would be very convenient to keep the array of matrices:

using FFTW

A = [rand(ComplexF64,(100,100)) for _ in 1:1000]

fft!(getindex.(A,1)) #apply fft in-place to the 1000 element grouping consisting of the 1st element of each matrix

No, I don’t think you can do this with an array of matrices straight out of the box. You can try with view.(A, 1), but considering this is a 1000-element array of 1x1 views with uncorrelated memory address, etc., even if it works (i.e if fft! is written generically enough for this sort of input) I expect it will be pretty slow.

3 Likes

I think this is a clear case of copying data is not always bad.

What I would recommend is:

using FFTW

A = [rand(ComplexF64,(100,100)) for _ in 1:1000]

setindex.((A,), 1, fft(getindex.(A,1))

The broadcast in the setindex may, at least, help avoid allocating intermediary data structs.

3 Likes

Your suggestion was a good compromise. In the end it was still a significant improvement in my overall speed to start with an array of matrices for initial operations, copy each slice into a 3D matrix for the fft function, then copy back to an array of matrices for subsequent operations.

2 Likes

I don’t think the broadcasting could prevent allocating in this case, since the getindex.(...) is inherently creating a new array (although it’s nice and compact to do it this way!). It seems about the same (maybe a bit slower even) as allocating the intermediate:

julia> @btime setindex!.($A, fft!(getindex.($A,1)), 1);
  94.378 μs (28 allocations: 25.94 KiB)

julia> @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end
  82.882 μs (27 allocations: 18.00 KiB)
2 Likes

It is kinda unfair you use @inbounds in one and not in the other. For me:

julia> @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end
  49.373 μs (44 allocations: 18.38 KiB)

julia> @btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);
  48.683 μs (45 allocations: 26.31 KiB)

Also, considering the number of allocations in both your benchmark and mine, it is clear the broadcast remove a single allocation, that is actually, 30% of the memory used… The difference in time is small, but my suggestion has a slightly better time in my machine, uses less memory and is more compact…

I now perceive my first code was wrong, this is the reason @jleman found it to be considerably faster. Explain in my next post.

2 Likes

I’m seeing the same relationship annotating or not annotating both… @inbounds is actually a microsecond slower on average even… I’m running on version 1.5.1.

julia> begin 
       @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end

       @btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);
       end;
  82.740 μs (27 allocations: 18.00 KiB)
  93.370 μs (28 allocations: 25.94 KiB)

julia> begin 
       @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end

       @btime setindex!.($A, fft!(getindex.($A,1)), 1);
       end;
  81.483 μs (27 allocations: 18.00 KiB)
  92.707 μs (28 allocations: 25.94 KiB)

Am I misunderstanding something? From the looks of it, the setindex!. method is the one allocating more memory.

2 Likes

In my first post I wrote:

setindex!.((A,), 1, fft(getindex.(A,1))

what is wrong. This will write the value 1 in the position given by the fft, what makes no sense:

A[fft(getindex.(A,1))] .= 1 

@jleman has probably corrected the order of the parameters to:

setindex!.((A,), fft(getindex.(A,1), 1)

But this is:

A[1] .= fft(getindex.(A,1)

What explains some gain in performance, it is overwriting contiguous positions in the first matrix with the the fft values, instead of overwriting the first value in each of the matrices. (I am not sure, however, how it did not have a type problem, as the fft return is a vector, not a matrix.)

The correct version is:

@btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);

That somehow is slightly faster than looping to me, even if it makes one extra allocation. (I will need to investigate further, it does not make sense to me why it does make an extra allocation nor why it is faster/equivalent despite that.)

I am sorry for the noise.

3 Likes