Ok, it sounds like the code you are using in your program and the code you posted above are quite different. Especially if you are using a 2D FFT in the example above and a 1D FFT in your private code.
Sorry for that, that was confusing
The main issue is still an in-place rfft
It seems you need to specify an array with the right stridedness when making the plan:
julia> using FFTW, LinearAlgebra
julia> A = randn(1000,1000);
julia> row = first(eachrow(A));
julia> typeof(row)
SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}
julia> col = first(eachcol(A));
julia> typeof(col)
SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}
julia> output = Vector{Complex{Float64}}(undef, length(rfftfreq(1000)));
julia> size(output)
(501,)
julia> row_plan = plan_rfft(row)
FFTW real-to-complex plan for 1000-element (1000,)-strided array of Float64
(rdft2-ct-dit/20
(hc2c-direct-20/38/0 "hc2cfdftv_20_sse2"
(rdft2-ct-dit/2
(hc2c-direct-2/2/0 "hc2cfdftv_2_sse2"
(rdft2-r2hc-direct-2 "r2cf_2")
(rdft2-r2hc01-direct-2 "r2cfII_2"))
(dft-direct-10 "n1fv_10_sse2"))
(rdft2-r2hc01-direct-20 "r2cfII_20"))
(dft-vrank>=1-x10/1
(dft-ct-dit/5
(dftw-direct-5/4 "t3fv_5_sse2")
(dft-directbuf/14-10-x5 "n2fv_10_sse2"))))
julia> mul!(output, row_plan, row); # works
julia> col_plan = plan_rfft(col)
FFTW real-to-complex plan for 1000-element array of Float64
(rdft2-ct-dit/20
(hc2c-direct-20/38/0 "hc2cfdftv_20_sse2"
(rdft2-ct-dit/2
(hc2c-direct-2/2/0 "hc2cfdftv_2_sse2"
(rdft2-r2hc-direct-2 "r2cf_2")
(rdft2-r2hc01-direct-2 "r2cfII_2"))
(dft-direct-10 "n1fv_10_sse2"))
(rdft2-r2hc01-direct-20 "r2cfII_20"))
(dft-vrank>=1-x10/1
(dft-ct-dit/5
(dftw-direct-5/4 "t3fv_5_sse2")
(dft-direct-10-x5 "n2fv_10_sse2"))))
julia> mul!(output, col_plan, col); # works
julia> mul!(output, col_plan, row);
ERROR: ArgumentError: FFTW plan applied to wrong-strides array
Stacktrace:
[1] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:424
[2] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}, Y::Vector{ComplexF64})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:431
[3] mul!(y::Vector{ComplexF64}, p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, x::SubArray{Float64, 1, Matrix{Float64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:781
[4] top-level scope
@ REPL[73]:1
julia> mul!(output, row_plan, col);
ERROR: ArgumentError: FFTW plan applied to wrong-strides array
Stacktrace:
[1] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:424
[2] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Y::Vector{ComplexF64})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:431
[3] mul!(y::Vector{ComplexF64}, p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, x::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:781
[4] top-level scope
@ REPL[74]
(actually learned this today myself!)
I’m not sure how that helps for me.
My input to the rfft is always the same, what’s changes is just the output because of the different dimension I do the Fourier transform over.
Ah ok. It seems FFTW.jl doesn’t expose a nice constructor for making plans in that case, but the internal FFTW objects do keep track of the stride of the output as well as the input, and we can write our own constructor to make use of that:
julia> using FFTW, LinearAlgebra
julia> output = Matrix{Complex{Float64}}(undef, 501, 501);
julia> A = randn(1000,1000);
julia> col = first(eachcol(A));
julia> col_plan = plan_rfft(col);
julia> output = Matrix{Complex{Float64}}(undef,501,501);
julia> mul!(@view(output[:, 1]), col_plan, col); # works
julia> mul!(@view(output[1, :]), col_plan, col);
ERROR: ArgumentError: FFTW plan applied to wrong-strides output
Stacktrace:
[1] assert_applicable(p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, X::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Y::SubArray{ComplexF64, 1, Matrix{ComplexF64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:435
[2] mul!(y::SubArray{ComplexF64, 1, Matrix{ComplexF64}, Tuple{Int64, Base.Slice{Base.OneTo{Int64}}}, true}, p::FFTW.rFFTWPlan{Float64, -1, false, 1, UnitRange{Int64}}, x::SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
@ FFTW ~/.julia/packages/FFTW/G3lSO/src/fft.jl:781
[3] top-level scope
@ REPL[123]:1
julia> make_rplan(input, output, region = 1:ndims(input); flags = FFTW.ESTIMATE, timelimit=FFTW.NO_TIMELIMIT) = FFTW.rFFTWPlan{eltype(input),FFTW.FORWARD,false,ndims(input)}(input, output, region, flags, timelimit)
make_rplan (generic function with 2 methods)
julia> p=make_rplan(col, @view(output[1,:]))
FFTW real-to-complex plan for 1000-element array of Float64
(rdft2-ct-dit/20
(hc2c-direct-20/38/0 "hc2cfdftv_20_sse2"
(rdft2-ct-dit/2
(hc2c-direct-2/2/0 "hc2cfdftv_2_sse2"
(rdft2-r2hc-direct-2 "r2cf_2")
(rdft2-r2hc01-direct-2 "r2cfII_2"))
(dft-direct-10 "n1fv_10_sse2"))
(rdft2-r2hc01-direct-20 "r2cfII_20"))
(dft-vrank>=1-x10/1
(dft-ct-dit/5
(dftw-direct-5/4 "t3fv_5_sse2")
(dft-direct-10-x5 "n1fv_10_sse2"))))
julia> mul!(@view(output[1, :]), p, col); # works
I think the FFTW.rFFTWPlan
constructor I’m using here is not part of the public API so this is unfortuantely using the internals of the package (which could therefore break if FFTW.jl updates in an incompatible way). Also, I’m not sure my make_rplan
is correct for multidimensional inputs/outputs; I kind of guessed that the type parameter N
should be ndims(input)
for example…