Has anybody tried differentiating ParallelStencil.jl CUDA kernels with Enzyme.jl?
Since Julia 1.7, Enzyme supports CUDA.jl kernels, so in principle I suppose it should be possible?
We are on it. Performing AD on CUDA.jl kernels works (with some limitations with respect to math functions, which need to be investigated further). The next step will be to test with ParallelStencil kernels, but a priori there shouldn't be any issues.
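For reference, the general pattern for differentiating a CUDA.jl kernel with Enzyme.jl looks roughly like the sketch below. This follows the approach from the Enzyme.jl GPU documentation (using `autodiff_deferred` inside a wrapper kernel); the exact `autodiff_deferred` signature has changed between Enzyme.jl versions, so treat this as illustrative rather than definitive:

```julia
using CUDA, Enzyme

# Primal kernel: squares each element of A in place.
function mul_kernel(A)
    i = threadIdx().x
    if i <= length(A)
        A[i] *= A[i]
    end
    return nothing
end

# Gradient kernel: differentiates mul_kernel in reverse mode.
# autodiff_deferred is required (instead of autodiff) inside GPU kernels.
function grad_mul_kernel(A, dA)
    Enzyme.autodiff_deferred(Reverse, Const(mul_kernel), Const, Duplicated(A, dA))
    return nothing
end

A  = CUDA.ones(64)
dA = CUDA.ones(64)  # seed for the reverse pass
@cuda threads=length(A) grad_mul_kernel(A, dA)
# dA now holds the gradient d(A.^2)/dA = 2A accumulated against the seed
```

In principle, the same wrapper-kernel pattern should carry over to kernels generated by ParallelStencil, since they lower to plain CUDA.jl kernels.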
Fantastic, thank you @luraess! Can't wait to start using it. Will the tests you mentioned be in the Enzyme.jl or the ParallelStencil.jl GitHub repository?
We may add an example to ParallelStencil once everything works as expected. In the meantime, we may use a third-party demo repo. I'll try to keep you posted, or post a follow-up in this Discourse thread.
Thanks !!
To follow up, @Jakub_Mitura: we have not yet experimented with using Enzyme.jl together with ParallelStencil.jl. However, we made good progress using AD on GPUs within an iterative solver. You can find the material in the demo repo GitHub - PTsolvers/PT-AD: Pseudo-transient auto-diff playground and the corresponding JuliaCon22 presentation on YouTube.