Depending on your datatype you can also save a factor of 2 by using real valued FFTs rfft.
If you need performant wrappers for FFT convolution planning (depending on the type), you can also check out FourierTools.jl (no paid advertisement
)
We provide a wrapper to get a planned convolution similar to plan_fft, etc. That provides a little bit of speedup.
If you want to bring memory consumption to almost zero, you can also try fft! but which only works on complex arrays.
julia> using FFTW
julia> function plan_conv!(v; flags=FFTW.MEASURE)
v_ft = fft(v)
p = plan_fft!(copy(v), flags=flags)
function conv!(u)
u .= (p * u) .* v_ft
return inv(p) * u
end
return conv!
end
plan_in_place (generic function with 2 methods)
julia> v = randn(ComplexF32, (1024, 1024));
julia> u = randn(ComplexF32, (1024, 1024));
julia> @time my_conv! = plan_conv!(v);
0.043965 seconds (25.53 k allocations: 9.676 MiB, 14.00% compilation time)
julia> @time my_conv! = plan_conv!(v);
0.039729 seconds (65 allocations: 8.005 MiB)
julia> @time my_conv!(u);
0.028485 seconds (21.88 k allocations: 8.962 MiB, 72.59% compilation time)
julia> @time my_conv!(u);
0.007707 seconds (2 allocations: 160 bytes)
Edit: Note, that plan_fft! might overwrite the provided array with zeros. Hence we should copy it before!
EditEdit: I added now a generic version to FourierTools.jl which works without any allocations during convolution itself (excluding planning).