CUDA.jl PR1903 added support for FFTs along more directions with CUDA.jl v5. I am a bit confused about how this works in practice, as I can’t find it documented. The PR states:
This is achieved by allowing fft-plans to have fewer dimensions than the data they are applied to. The trailing dimensions are treated as non-transform directions and transforms are executed sequentially.
So, in practice I’d like to have one plan that applies both to single samples and batched ones, but this doesn’t work as I expected:
    using CUDA
    A = CUDA.rand(100);
    B = CUDA.rand(100,10);
    plan = CUDA.CUFFT.plan_fft(A)
    plan * A # works
    plan * B # doesn't work
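For comparison, the conventional AbstractFFTs way to handle the batched case is to build the plan on the batched array itself and pass the transform dimension explicitly. Below is a minimal sketch of that, assuming `plan_fft(B, 1)` is supported for `CuArray`s and using complex input to sidestep any real-to-complex conversion. The names `plan_single`, `plan_batched`, `Ac`, and `Bc` are only for illustration. Note that this needs a separate plan per array size, which is exactly what I was hoping to avoid.

```julia
using CUDA

# Complex input, so no real-to-complex conversion is involved (an assumption made for simplicity).
Ac = CuArray(rand(ComplexF32, 100))       # single sample
Bc = CuArray(rand(ComplexF32, 100, 10))   # batch of 10 samples, one per column

plan_single  = CUDA.CUFFT.plan_fft(Ac)      # 1D plan for a length-100 vector
plan_batched = CUDA.CUFFT.plan_fft(Bc, 1)   # transform along dimension 1 only; dimension 2 is the batch

y_single  = plan_single * Ac
Y_batched = plan_batched * Bc               # each column is transformed independently

# Transform the first column on its own, to compare against the batched result.
y_first = plan_single * Bc[:, 1]
```

If this behaves as I expect, `Array(Y_batched)[:, 1]` should agree with `Array(y_first)` up to floating-point error.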
Am I misunderstanding what the PR added, or using it wrong? (I am on CUDA.jl v5)
Maybe @RainerHeintzmann could help out?