CUDA adapter for FFTW plan

I am writing code in which I want to use a custom structure inside a CUDA kernel. Following the CUDA.jl manual (https://cuda.juliagpu.org/stable/tutorials/custom_structs/), I need to write an Adapt.jl adapter for my structure. However, one of the fields of this structure is an FFTW.jl Fourier transform plan, so I first have to write an adapter for the FFTW plan itself. Here is an MWE:

using Adapt
using CUDA
using FFTW

abstract type ARCH{T} end
struct CPU{T} <: ARCH{T} end
struct GPU{T} <: ARCH{T} end
CPU() = CPU{Float64}()
GPU() = GPU{Float32}()

function Adapt.adapt_storage(::CPU{T}, p::FFTW.cFFTWPlan) where T
    tmp = zeros(Complex{T}, p.sz)
    return plan_fft!(tmp)
end

function Adapt.adapt_storage(::GPU{T}, p::FFTW.cFFTWPlan) where T
    tmp = CUDA.zeros(Complex{T}, p.sz)
    return plan_fft!(tmp)
end


E = zeros(ComplexF64, 128)

p = plan_fft!(E)   # FFTW in-place forward plan for 128-element array of ComplexF64

pa = adapt(GPU(), p)   # CUFFT in-place complex forward plan for 128-element CuArray of ComplexF32

This code works perfectly, but every call to adapt_storage allocates the tmp array, which in my case can be very large. Therefore, I am looking for a way to convert the FFTW plan directly, using the low-level definition of the plan structure.
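To put a rough number on that cost: the temporary array occupies prod(sz) * sizeof(Complex{T}) bytes. In the sketch below, the helper name tmp_bytes and the 1024^3 grid are hypothetical and chosen only for illustration; they are not taken from my actual code.

```julia
# Hypothetical helper: bytes needed for the temporary array that
# adapt_storage allocates for a plan of size `sz` and real type `T`.
tmp_bytes(::Type{T}, sz::Dims) where {T<:Real} = prod(sz) * sizeof(Complex{T})

small = tmp_bytes(Float32, (128,))              # size used in the MWE above
large = tmp_bytes(Float32, (1024, 1024, 1024))  # illustrative 3D grid (assumption)

println(small, " bytes")
println(large / 2^30, " GiB")
```

For large grids the temporary array can be comparable to the data itself, which is why I would like to avoid it.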

The plan structure, cFFTWPlan, is defined in the FFTW.jl source. For my adapter I would like to write something similar to this:

p = plan_fft!(E)

(; plan, sz, osz, istride, ostride, ialign, oalign, flags, region) = p

pa = FFTW.cFFTWPlan{ComplexF64, -1, true, 1, UnitRange{Int64}}(plan, sz, osz, istride, ostride, ialign, oalign, flags, region)
# It does not work!
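
One possible reason for the failure, assuming cFFTWPlan declares its own inner constructor (an assumption, not checked against the FFTW.jl source here): in Julia, a struct that defines an inner constructor does not get the default field-by-field constructor. The sketch below reproduces this with a hypothetical ToyPlan type that is not part of FFTW.jl.

```julia
# Hypothetical stand-in for a plan type; ToyPlan is NOT part of FFTW.jl.
struct ToyPlan{T,N}
    sz::NTuple{N,Int}
    flags::UInt32
    # Declaring any inner constructor suppresses the default
    # field-by-field constructor.
    function ToyPlan{T,N}(flags::Integer, X::Array{T,N}) where {T,N}
        return new{T,N}(size(X), UInt32(flags))
    end
end

x = zeros(ComplexF64, 128)
tp = ToyPlan{ComplexF64,1}(0, x)   # works: matches the inner constructor

# Passing the fields back in positionally throws a MethodError,
# because no constructor with that signature exists.
failed = try
    ToyPlan{ComplexF64,1}(tp.sz, tp.flags)
    false
catch err
    err isa MethodError
end

# List the constructors that actually exist for the concrete type.
ctors = methods(ToyPlan{ComplexF64,1})
println(ctors)
```

If this is indeed the cause, calling methods(typeof(p)) on the real plan should show which constructor signatures are actually available.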

Any ideas how I can do it?
