In my attempts to play with CUDA in Julia, I’ve come across something I can’t really understand, hopefully because I’m doing something wrong. In my calculations I need to perform Fourier transforms, which I do with the fft() function. Sadly, I find that the result of fft() on the CPU differs from the result of fft() on the same array transferred to the GPU. My code is simply:
    using CuArrays
    using CUDAnative
    using CUDAdrv
    using FFTW

    N = 64;
    A = rand(Float32,N,N,N);
    B = fft(A);
    Ad = cu(A);
    Bd = fft(Ad);
    BB = Array(Bd);
    maximum(abs.(B-BB))
    > 0.015625f0
…and the difference grows with increasing N. For instance, with N=256 I get a difference (in a single run) of 0.2275149f0. Of course the values of the FT also grow with N, so at this point I see small differences that are neither negligible nor impossible to live with.
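To put those numbers in proportion to the size of the transform, here is a rough sketch of how the relative error could be measured. It runs on the CPU only and compares the Float32 result against a Float64 reference, so it says nothing about the GPU by itself. The helper `relative_error` is a name I made up for illustration, and I am assuming that dividing by the largest magnitude in the reference is a reasonable normalisation here.

```julia
using FFTW

# Hypothetical helper (not part of any package): largest absolute
# difference divided by the largest magnitude in the reference result.
relative_error(ref, approx) = maximum(abs.(ref .- approx)) / maximum(abs.(ref))

N = 64
A = rand(Float32, N, N, N)

B32 = fft(A)             # single-precision transform on the CPU
B64 = fft(Float64.(A))   # double-precision reference on the same data

println("max |B64|      = ", maximum(abs.(B64)))
println("absolute error = ", maximum(abs.(B64 .- B32)))
println("relative error = ", relative_error(B64, B32))
println("eps(Float32)   = ", eps(Float32))
```

The same helper could presumably be applied to the arrays in my code as `relative_error(B, BB)`, which would show whether the CPU/GPU difference is comparable to ordinary Float32 rounding.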
So is this expected behaviour? Or am I doing something weird?