`plan_rfft` together used with `mul!` and `@view` complains about wrong-strides-output


initially I wanted to have rfft! but since it doesn’t exist, I tried the workaround mentioned in this discourse discussion.

However, mul! complains when used with a plan_rfft generated matrix:

using LinearAlgebra, FFTW

function main()
    # array we want to process
    x = randn((60,40))

    y = rfft(randn((80,60))) 
    # just create a view storage
    y2 = view(y, 1:31, 1:40)

    p = plan_rfft(x)

    @time y2 .= p * x; # this still allocates memory because of p * x
    @time y2 .= p * x;
    @time mul!(y2, p, x); # this should be memory allocation free
    @time mul!(y2, p, x);

Output then is:

julia> main()
  0.000020 seconds (2 allocations: 19.453 KiB)
  0.000012 seconds (2 allocations: 19.453 KiB)
ERROR: ArgumentError: FFTW plan applied to wrong-strides output
 [1] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 2, UnitRange{Int64}}, X::Matrix{Float64}, Y::SubArray{ComplexF64, 2, Matrix{ComplexF64}, Tuple{UnitRange{Int64}, UnitRange{Int64}}, false})
   @ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:435
 [2] mul!
   @ ~/.julia/packages/FFTW/G3lSO/src/fft.jl:781 [inlined]
 [3] macro expansion
   @ ./timing.jl:210 [inlined]
 [4] main()
   @ Main ~/julia/FourierTools.jl/src/rfft_issue.jl:17
 [5] top-level scope
   @ REPL[4]:1

Apparently FFTW.jl seems not to correctly accept the view as output.
Am I doing something wrong or what is the issue here?



The output is non-contiguous and apparently FFTW cannot handle that.

1 Like

Strange, that’s true. So views are somehow supported within mul! but only in very special cases:
Might that be easy fixed within FFTW.jl because usual Julia code works with such views?

The following works, where I cropped instead the second dimension instead of the first one (31 == size(y, 1)):

function main()
    # array we want to process
    x = randn((60,40))

    y = rfft(randn((60,50))) 
    # just create a view storage which crops second dimension:
    y2 = view(y, 1:31, 1:40)

    p = plan_rfft(x,  flags=FFTW.UNALIGNED)

    @time y2 .= p * x;
    @time mul!(y2, p, x);

There have been some issues fixed in the past regarding views and rfft but during assignment.

The issue is that you are taking a corner out of a 2D Array. If you take a contiguous chunk out of a 1D Vector it should be fine

Memory is linear, see this graphic


So if you just take a corner from the array (with incomplete columns), there are gaps that FFTW would have to jump over.

Yes, but I actually want that.

I mean, is it just mul! or FFTW.jl? because a normal statement assigning an array to a new, even scattered view array just works. So it must be due to the C wrapper, right?

Yes, mul! in this case is just calling another method from the C-library FFTW which is not set up to handle non-contiguous output.

1 Like

Keep in mind you can just pre-allocate two arrays: one for the unpadded FFTW output and another one with padding.

The issue is, I’m calling several 1D rffts for different dimensions. Apparently in place rfft does not exist for single arbitrary dimensions.

That’s why I wanted to allocate one large array covering at least the size in each dimension needed to accommodate each single result.
For each dimension I simply would have needed a view halving the rfft dimension.

I’m not sure I understand correctly—are performing the FFT only along one axis?

Yeah. It would be nice to prevent allocation of arr_ft every time, because x could be simply in-place overwritten. For the inverse it works, because the resulting size is always the same:

See below some part:

for d in dims
        p = plan_rfft(arr, d)

        arr_ft = p * arr 
        arr_ft .*= ϕ # do something with arr_ft
        mul!(arr, inv(p), arr_ft)

Ok, so firstly that is not what your code currently does. Your plan p performs a 2D FFT — because it was set up with a 2D input array.

If you want to perform a 1D FFT, you need to use a Vector to create the plan p. Then you could write a for-loop to go through your data column by column (you can use views for this) and apply p to each.

also, I’m pretty sure you dont need to use inv(p) — since that probably allocates a new plan for the inverse transform. There should be an ldiv!() method which works like mul!() but performs and inverse transform.

1 Like

I’m not sure if you are correct. I believe the dims argument for fft, rfft is intended for exactly that case:
rfft performs a length(dims)-dimensional Fourier transform over the dims dimensions of the N-dimensional array.

rfft(A [, dims])

  Multidimensional FFT of a real array A, exploiting the fact that the transform has conjugate symmetry in order to save
  roughly half the computational time and storage costs compared with fft. If A has size (n_1, ..., n_d), the result has
  size (div(n_1,2)+1, ..., n_d).

  The optional dims argument specifies an iterable subset of one or more dimensions of A to transform, similar to fft.
  Instead of (roughly) halving the first dimension of A in the result, the dims[1] dimension is (roughly) halved in the
  same way.

But you are not using dims in the code you posted

The issue with ldiv!(A, B) is that, both A and B have different types than inv(P) * B, so can’t work?

rfft(arr, d) with d being an integer does that.

ldiv!(output, p, input)

where is the integer in your code?

Manual too long for ldiv! couldn’t see that :joy:

for d in ntuple(identity, N) is what I’m using. I’m looping over the dimensions of the array.

lmul! doesn’t bring any speedup, I believe internally inv and ldiv are probably doing exactly the same thing