Operations on elements of a common index of an array of matrices

jleman · October 3, 2020, 10:29pm

I have an array of matrices. Is it possible to update groups of common indices in place? The code below doesn’t work but I think conveys what I am trying to do. Yes it would be easier to do this with a 3D matrix, but for other reasons it would be very convenient to keep the array of matrices:

using FFTW

A = [rand(ComplexF64,(100,100)) for _ in 1:1000]

fft!(getindex.(A,1)) #apply fft in-place to the 1000 element grouping consisting of the 1st element of each matrix

tomerarnon · October 3, 2020, 10:42pm

No, I don’t think you can do this with an array of matrices straight out of the box. You can try with view.(A, 1), but considering this is a 1000-element array of 1x1 views with uncorrelated memory address, etc., even if it works (i.e if fft! is written generically enough for this sort of input) I expect it will be pretty slow.

Henrique_Becker · October 3, 2020, 11:22pm

I think this is a clear case of copying data is not always bad.

What I would recommend is:

using FFTW

A = [rand(ComplexF64,(100,100)) for _ in 1:1000]

setindex.((A,), 1, fft(getindex.(A,1))

The broadcast in the setindex may, at least, help avoid allocating intermediary data structs.

jleman · October 4, 2020, 3:49am

Your suggestion was a good compromise. In the end it was still a significant improvement in my overall speed to start with an array of matrices for initial operations, copy each slice into a 3D matrix for the fft function, then copy back to an array of matrices for subsequent operations.

tomerarnon · October 4, 2020, 10:23am

I don’t think the broadcasting could prevent allocating in this case, since the getindex.(...) is inherently creating a new array (although it’s nice and compact to do it this way!). It seems about the same (maybe a bit slower even) as allocating the intermediate:

julia> @btime setindex!.($A, fft!(getindex.($A,1)), 1);
  94.378 μs (28 allocations: 25.94 KiB)

julia> @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end
  82.882 μs (27 allocations: 18.00 KiB)

Henrique_Becker · October 4, 2020, 1:06pm

It is kinda unfair you use @inbounds in one and not in the other. For me:

julia> @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end
  49.373 μs (44 allocations: 18.38 KiB)

julia> @btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);
  48.683 μs (45 allocations: 26.31 KiB)

Also, considering the number of allocations in both your benchmark and mine, it is clear the broadcast remove a single allocation, that is actually, 30% of the memory used… The difference in time is small, but my suggestion has a slightly better time in my machine, uses less memory and is more compact…

I now perceive my first code was wrong, this is the reason @jleman found it to be considerably faster. Explain in my next post.

tomerarnon · October 4, 2020, 2:41pm

I’m seeing the same relationship annotating or not annotating both… @inbounds is actually a microsecond slower on average even… I’m running on version 1.5.1.

julia> begin 
       @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           @inbounds for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end

       @btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);
       end;
  82.740 μs (27 allocations: 18.00 KiB)
  93.370 μs (28 allocations: 25.94 KiB)

julia> begin 
       @btime let A = $A
           ffta1 = fft!(getindex.(A, 1))
           for i in eachindex(A)
               A[i][1] = ffta1[i]
           end
       end

       @btime setindex!.($A, fft!(getindex.($A,1)), 1);
       end;
  81.483 μs (27 allocations: 18.00 KiB)
  92.707 μs (28 allocations: 25.94 KiB)

Am I misunderstanding something? From the looks of it, the setindex!. method is the one allocating more memory.

Henrique_Becker · October 4, 2020, 3:05pm

In my first post I wrote:

setindex!.((A,), 1, fft(getindex.(A,1))

what is wrong. This will write the value 1 in the position given by the fft, what makes no sense:

A[fft(getindex.(A,1))] .= 1

@jleman has probably corrected the order of the parameters to:

setindex!.((A,), fft(getindex.(A,1), 1)

But this is:

A[1] .= fft(getindex.(A,1)

What explains some gain in performance, it is overwriting contiguous positions in the first matrix with the the fft values, instead of overwriting the first value in each of the matrices. (I am not sure, however, how it did not have a type problem, as the fft return is a vector, not a matrix.)

The correct version is:

@btime @inbounds setindex!.($A, fft!(getindex.($A,1)), 1);

That somehow is slightly faster than looping to me, even if it makes one extra allocation. (I will need to investigate further, it does not make sense to me why it does make an extra allocation nor why it is faster/equivalent despite that.)

I am sorry for the noise.

Topic		Replies	Views
A question on broadcasting fft for vectors of matrices Numerics question , arrays , broadcasting	9	755	February 20, 2022
Apply array function such as fft to nested array New to Julia fftw , arrays	2	575	August 25, 2021
Counterintuitive array update allocations Performance	11	572	July 17, 2021
Best strategy for fastest way to extract arrays built from matrix elements and then reassemble after ifft() General Usage fftw , performance , matrices	2	740	September 22, 2020
Multiplication of sub matrices General Usage	7	665	September 7, 2018

Operations on elements of a common index of an array of matrices

Related topics