Organizing a package with both GPU and CPU implementations

Hi everyone,
I am trying to develop a package that provides both a GPU-parallelized implementation and a CPU-parallelized implementation. How should I organize the project directory? In particular, how should I set things up so that users who need the CPU version load only MPI (using MPI), and users who need the GPU version load only CUDA (using CUDA)? Thank you in advance.
