GPU/CPU-agnostic FFT code


I am looking to port my professor’s 15-year-old C++ code to Julia in a GPU/CPU-agnostic way, maybe by using AcceleratedKernels.jl. The C++ code uses the FFTW library. I know CUDA has cuFFT, but I don’t know if AcceleratedKernels.jl or JuliaGPU has such a framework for FFTs. Is this a good idea? Or is this too ambitious, and should I go straight for cuFFT or something else?

Julia has an FFTW.jl library, and the `fft`/`plan_fft` functions dispatch on CPU and GPU arrays, so there is no problem mixing CuArray programming and kernel programming — both will be agnostic. If you need something more than AcceleratedKernels.jl, KernelAbstractions.jl lets you write agnostic kernels for CPU and GPU. Also, CxxWrap.jl can help you port pieces you don’t know how to translate, but those won’t be agnostic. For instance, in this code:

using FFTW, CUDA, BenchmarkTools

function try_FFT_on_cpu()
    values = rand(256, 256, 256)
    value_complex = ComplexF32.(values)
    cvalues = similar(value_complex, ComplexF32)
    copyto!(cvalues, values)
    cy = similar(cvalues)
    cF = plan_fft!(cvalues, flags=FFTW.MEASURE)
    @btime a = ($cF * $cy)
    return nothing
end

function try_FFT_on_cuda()
    values = rand(256, 256, 256)
    value_complex = ComplexF32.(values)
    cvalues = similar(cu(value_complex), ComplexF32)
    copyto!(cvalues, values)
    cy = similar(cvalues)
    cF = plan_fft!(cvalues)
    @btime CUDA.@sync a = ($cF * $cy)
    return nothing
end

from Unreasonably fast FFT on CUDA - #8 by roflmaostc
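For the kernel side, here is a minimal sketch of what a backend-agnostic kernel looks like with KernelAbstractions.jl (the names `scale_kernel!` and `scale!` are hypothetical; the CPU backend is shown, but the same code runs on `CUDABackend()` etc., picked from the array type):

```julia
using KernelAbstractions

# Element-wise y[i] = a * x[i], written once for any backend.
@kernel function scale_kernel!(y, x, a)
    i = @index(Global)
    @inbounds y[i] = a * x[i]
end

function scale!(y, x, a)
    backend = get_backend(x)       # CPU() for Array, CUDABackend() for CuArray
    kernel! = scale_kernel!(backend)
    kernel!(y, x, a; ndrange = length(x))
    KernelAbstractions.synchronize(backend)
    return y
end

x = rand(Float32, 1024)
y = similar(x)
scale!(y, x, 2.0f0)
```

Passing a `CuArray` instead of an `Array` is all it takes to move this to the GPU — the kernel body doesn’t change.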


To be precise, it’s not FFTW.jl doing the dispatch: AbstractFFTs.jl defines the generic `fft`/`plan_fft` interface, FFTW.jl implements it for CPU arrays, and CUDA.jl dispatches the `CuArray` methods to cuFFT.
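That’s why generic code only needs to call the AbstractFFTs interface; a small CPU-only sketch (`power_spectrum` is a made-up helper — on a GPU machine, loading CUDA.jl and passing a `CuArray` would route the same calls to cuFFT):

```julia
using FFTW   # provides the AbstractFFTs implementation for CPU arrays
# using CUDA # would provide the CuArray implementation via cuFFT

# `fft` and `plan_fft` come from AbstractFFTs, so this function is
# array-type agnostic: Array hits FFTW, CuArray would hit cuFFT.
power_spectrum(x) = abs2.(fft(x))

x = ComplexF32.(rand(Float32, 256))
p = plan_fft(x)   # concrete plan type depends on the array type
y = p * x         # applying the plan is the same on every backend
```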
