The rhs of this one will be Float64 if any of the values is Float64:
T[2:N-1,2:N-1] = Tn[2:N-1,2:N-1] + Fo * dt * ((Tn[3:N,2:N-1] - 2 * Tn[2:N-1,2:N-1] + Tn[1:N-2,2:N-1]) / h + (Tn[2:N-1,3:N] - 2 * Tn[2:N-1,2:N-1] + Tn[2:N-1,1:N-2]) / h)
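To check the promotion yourself, here is a minimal sketch on plain CPU arrays. The array size and the values of Fo, dt and h are made up for illustration, and the expression is shortened; only the element types matter here.

```julia
# Hypothetical small input, only to show element-type promotion.
Tn = rand(Float32, 4, 4)

# All scalars Float32: the result stays Float32.
Fo32, dt32, h32 = 0.1f0, 0.01f0, 0.5f0
rhs32 = Tn .+ Fo32 * dt32 .* (Tn ./ h32)
@show eltype(rhs32)

# A single Float64 scalar (here dt) promotes the whole right-hand side.
dt64 = 0.01
rhs64 = Tn .+ Fo32 * dt64 .* (Tn ./ h32)
@show eltype(rhs64)
```

If the second case is what happens in your code, converting the scalars once up front (for example with Float32(dt)) should keep everything in Float32.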
Also, I’m not sure how CUDA likes slices.
One additional tip in their documentation is to avoid Int64, so maybe it’ll be beneficial to also replace the factors of 2 with Int32(2).
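A small CPU-only sketch of both points, with a made-up 6×6 array. It only shows what Julia does with the types and the slices; whether either change helps on the GPU is something you would have to measure.

```julia
Tn = rand(Float32, 6, 6)

# An integer factor does not widen a Float32 array, whichever integer type it has,
# so swapping 2 for Int32(2) does not change the element type of the result.
@show eltype(2 .* Tn)
@show eltype(Int32(2) .* Tn)

# A plain slice on the right-hand side allocates a copy;
# @views turns the same indexing expression into a view of the parent array.
inner_copy = Tn[2:end-1, 2:end-1]
inner_view = @views Tn[2:end-1, 2:end-1]
@show inner_copy isa Matrix{Float32}
@show inner_view isa SubArray
```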
It should behave the same as before; the only change is that I removed N as an explicit parameter, and it is now calculated from the size of the array.
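For reference, a rough sketch of what "N from the size of the array" can look like. The function name step! and its argument order are hypothetical, the grid is assumed to be square, and the broadcasting dots and @views are my additions on top of the update formula quoted above.

```julia
# Hypothetical helper: N is read from the array instead of being passed in.
function step!(T, Tn, Fo, dt, h)
    N = size(Tn, 1)  # assumes a square N×N grid
    @views T[2:N-1, 2:N-1] .= Tn[2:N-1, 2:N-1] .+ Fo * dt .* (
        (Tn[3:N, 2:N-1] .- 2 .* Tn[2:N-1, 2:N-1] .+ Tn[1:N-2, 2:N-1]) ./ h .+
        (Tn[2:N-1, 3:N] .- 2 .* Tn[2:N-1, 2:N-1] .+ Tn[2:N-1, 1:N-2]) ./ h)
    return T
end

# Sanity check: a uniform field has zero second differences, so one step changes nothing.
Tn = ones(Float32, 5, 5)
T = copy(Tn)
step!(T, Tn, 0.1f0, 0.01f0, 0.5f0)
@show T == Tn
@show eltype(T)
```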