Migrating CUDA.jl atomic operations to ParallelStencil.jl using Atomix.jl?

Hello,

I am currently porting my code from CUDA.jl to ParallelStencil.jl. While I have successfully managed the transition for mixed Float64 and ComplexF64 types, I have run into a challenge replacing CUDA-specific atomic operations, specifically:

CUDA.@atomic F[i, j, 1] += ...

Based on the KernelAbstractions.jl documentation, I am considering using Atomix.jl, but I am unclear on how the integration works in practice. Specifically:

  1. If I initialize my setup with @init_parallel_stencil(CUDA, Float64, 2, inbounds=true), will Atomix.jl automatically utilize the correct CUDA atomic instructions?
  2. Are there specific steps or wrappers required to ensure Atomix.jl works seamlessly within a ParallelStencil kernel?

I have looked through the Atomix.jl documentation, but it is quite sparse. Any guidance or examples would be greatly appreciated!

Best regards

ParallelStencil itself remains unaware of atomic operations; the mapping to the correct hardware instructions (when available) happens in the backend of Atomix.

So essentially, the CUDA extension of Atomix.jl (ext/AtomixCUDAExt.jl in the JuliaConcurrent/Atomix.jl repository on GitHub) and UnsafeAtomics.jl together provide the infrastructure, and their use is transparent to KernelAbstractions and ParallelStencil.
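To make this concrete, here is a minimal sketch of what the migration could look like inside a ParallelStencil kernel. The kernel name, array names, and sizes are illustrative, not from your code; the key point is that `Atomix.@atomic` replaces `CUDA.@atomic` and is resolved to CUDA atomic instructions by the AtomixCUDAExt extension when the arrays live on the GPU:

```julia
using CUDA, ParallelStencil
using Atomix: @atomic   # explicit import so this @atomic is Atomix's, not Base's

@init_parallel_stencil(CUDA, Float64, 2, inbounds=true)

# Illustrative kernel: several (i, j) threads may accumulate into the same
# slot of F, so the update must be atomic to avoid lost writes.
@parallel_indices (i, j) function accumulate!(F, A)
    @atomic F[i, j, 1] += A[i, j]
    return
end

# Hypothetical launch for 2D arrays A and F:
# @parallel (1:size(A, 1), 1:size(A, 2)) accumulate!(F, A)
```

No ParallelStencil-specific wrapper should be needed: the kernel body sees device arrays, and Atomix dispatches on them through its CUDA extension.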


Thank you for your help!