Julia can be dramatically slower than Matlab when solving 2D PDE

The rhs of this one will be promoted to Float64 if any of the values is Float64:

```julia
T[2:N-1,2:N-1] = Tn[2:N-1,2:N-1] + Fo * dt * ((Tn[3:N,2:N-1] - 2 * Tn[2:N-1,2:N-1] + Tn[1:N-2,2:N-1]) / h + (Tn[2:N-1,3:N] - 2 * Tn[2:N-1,2:N-1] + Tn[2:N-1,1:N-2]) / h)
```
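A minimal sketch of the promotion issue, assuming `Tn` is a Float32 array and `dt` is an ordinary (Float64) literal:

```julia
# Mixing a Float32 array with a Float64 scalar promotes the whole
# expression to Float64, which hurts GPU throughput.
Tn = rand(Float32, 4, 4)
dt = 0.1                          # Float64 literal
eltype(Tn .+ dt)                  # Float64 — the rhs got promoted
eltype(Tn .+ Float32(dt))         # Float32 — promotion avoided
eltype(Tn .+ 0.1f0)               # Float32 — `f0` literals also work
```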

Also, I’m not sure how well CUDA handles slices.
One additional tip in their documentation is to avoid Int64, so it may also be beneficial to replace the factors of 2 with Int32(2).
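For what it’s worth, integer literals don’t trigger Float64 promotion on their own, so the Int32 tip is about GPU integer arithmetic rather than floating-point types — a quick check (plain Julia, no GPU needed):

```julia
# An Int64 literal times a Float32 stays Float32; only a Float64
# literal promotes the result to Float64.
x = 1.0f0
typeof(2 * x)         # Float32
typeof(Int32(2) * x)  # Float32
typeof(2.0 * x)       # Float64 — this is the promotion to avoid
```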

It should behave the same as before; the only change is that I removed N as an explicit parameter, and it is now computed from the size of the array.
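A sketch of what that version might look like (the function name `step!` is my invention; `T`, `Tn`, `Fo`, `dt`, `h` are from the snippet above), using `@views` and broadcasting so the update fuses into one pass and avoids slice copies:

```julia
# Explicit update step for the 2D stencil; N is derived from the array,
# and @views avoids allocating copies for each rhs slice.
function step!(T, Tn, Fo, dt, h)
    N = size(Tn, 1)
    @views T[2:N-1,2:N-1] .= Tn[2:N-1,2:N-1] .+ Fo .* dt .* (
        (Tn[3:N,2:N-1] .- 2 .* Tn[2:N-1,2:N-1] .+ Tn[1:N-2,2:N-1]) ./ h .+
        (Tn[2:N-1,3:N] .- 2 .* Tn[2:N-1,2:N-1] .+ Tn[2:N-1,1:N-2]) ./ h)
    return T
end
```

Because everything is broadcast, the same function should work unchanged on `CuArray`s, where CUDA.jl compiles the fused broadcast into a single kernel.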

Got it. Thanks!

The CUDA version allocates a lot, even though the CPU version doesn’t, so it is probably copying the slices. Even so, on my machine it is 150× faster than the CPU run.

I wonder if that can be improved by preallocating space for the slices and copying into it.
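A CPU-side illustration of why the slices allocate in the first place (function names are mine): plain slicing on the rhs materializes a copy, while `@views` reuses the parent array, so no preallocated buffer is needed at all.

```julia
# Slicing on the rhs allocates a temporary copy of the interior block.
function update_copy!(T, Tn)
    T[2:end-1, 2:end-1] = Tn[2:end-1, 2:end-1]
    return T
end

# @views makes the rhs a view into Tn, so the assignment avoids the copy.
function update_view!(T, Tn)
    @views T[2:end-1, 2:end-1] .= Tn[2:end-1, 2:end-1]
    return T
end
```

Comparing `@allocated` on the two (after a warm-up call for compilation) shows the view-based version allocating far less than the copying one.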
