Handwritten CUDA.jl kernels will recompile with different number types, so that's fine. That said, stencils are linear operators, so directly defining their derivatives isn't hard: I'd just define the derivative overload for NNlib.jl's conv and use it.
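As a minimal sketch of what that overload could look like: wrap the stencil application in a small helper and give it a `ChainRulesCore.rrule` whose pullback uses NNlib's adjoint-convolution helpers. The wrapper name `apply_stencil` is hypothetical, and NNlib already ships rules for `conv` itself, so this is just an illustration of the "linear operator, so the derivative is the adjoint" point rather than a definitive implementation.

```julia
using NNlib, ChainRulesCore

# Hypothetical wrapper: apply a fixed stencil `w` to a field `u`
# (arrays in NNlib's WHCN-style layout) via plain convolution.
apply_stencil(u, w) = NNlib.conv(u, w)

function ChainRulesCore.rrule(::typeof(apply_stencil), u, w)
    cdims = NNlib.DenseConvDims(u, w)
    y = NNlib.conv(u, w, cdims)
    function stencil_pullback(ȳ)
        ȳ = unthunk(ȳ)
        # The stencil is linear in `u`, so the pullback w.r.t. the field
        # is just the adjoint (transposed) convolution with the same kernel.
        ū = @thunk NNlib.∇conv_data(ȳ, w, cdims)
        # Only needed if the stencil coefficients themselves are trainable.
        w̄ = @thunk NNlib.∇conv_filter(u, ȳ, cdims)
        return NoTangent(), ū, w̄
    end
    return y, stencil_pullback
end
```

If the stencil is fixed (not a learned parameter), the `w̄` branch can be replaced with `NoTangent()`, and the whole rule reduces to one adjoint convolution per reverse pass.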