Hi everyone!
I am a physicist and photonics engineer. Recently, Meta released a Julia package called Khronos to perform FDTD (Finite-Difference Time-Domain) simulations. The basic idea is to iteratively propagate light through a staggered grid by solving Maxwell’s equations.
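To make the idea concrete, here is a minimal sketch (not Khronos code, just my understanding of the scheme) of a 1D FDTD loop in plain Julia. E and H live on staggered grid points, and each time step updates H from the spatial difference of E and then E from the spatial difference of H, in normalized units:

```julia
# Minimal 1D FDTD sketch (Yee scheme, normalized units) — illustrative only,
# names and structure are my own, not Khronos's.
nx, nsteps = 200, 300
E = zeros(nx)       # electric field at integer grid points
H = zeros(nx - 1)   # magnetic field at half-integer grid points
S = 0.5             # Courant factor c*dt/dx (must be <= 1 for stability in 1D)

for n in 1:nsteps
    # Update H from the curl (spatial difference) of E
    for i in 1:nx-1
        H[i] += S * (E[i+1] - E[i])
    end
    # Update E from the curl of H (interior points only)
    for i in 2:nx-1
        E[i] += S * (H[i] - H[i-1])
    end
    # Soft source: inject a Gaussian pulse at the grid center
    E[nx ÷ 2] += exp(-((n - 30) / 10)^2)
end
```

In a multi-GPU setting, each device would own a chunk of these arrays, which is exactly where the boundary problems below come from.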
Khronos is built on KernelAbstractions.jl, allowing it to run on both CPUs (x86 or Apple Silicon) and GPUs. However, it currently only supports a single GPU, which can be limiting for larger problems.
My goal is to extend Khronos to support multi-GPU simulations. I am reaching out for advice on which Julia packages could be useful for this task. Based on my research, I believe I will need something like MPI (via MPI.jl) for inter-GPU communication. Following a suggestion from @luraess and @vchuravy on my original issue in the KA.jl GitHub repository, I am posting here for additional insights.
Here are some of the specific challenges I am facing:
- Halo Exchange: I need an efficient way to share boundary information (halo exchange) between adjacent subdomains after each time step, so that each GPU sees correct field values at its subdomain boundaries.
- Monitors: I need to implement monitors that capture snapshots of the field at regular intervals. One type of monitor will capture the field after a fixed number of iterations, while another will accumulate a Fourier transform of the field over time. To minimize communication overhead, I am considering processing the monitors locally on each GPU and only gathering the results at the end of the simulation.
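For the halo exchange, my current mental model is something like the following sketch using MPI.jl point-to-point calls (keyword-argument API as of MPI.jl v0.20; the function and field names are hypothetical). Each rank owns the interior of its array and keeps one halo cell on each side, refreshed from its neighbors after every time step:

```julia
using MPI

# Hypothetical sketch: one-cell halo exchange along a single dimension.
# field[1] and field[end] are halo cells; field[2:end-1] is owned data.
function exchange_halos!(field::Vector{Float64}, comm::MPI.Comm)
    rank   = MPI.Comm_rank(comm)
    nprocs = MPI.Comm_size(comm)
    left   = rank > 0          ? rank - 1 : MPI.PROC_NULL
    right  = rank < nprocs - 1 ? rank + 1 : MPI.PROC_NULL

    sendl = field[2:2]             # first owned cell → left neighbor
    sendr = field[end-1:end-1]     # last owned cell → right neighbor
    recvl = similar(sendl)
    recvr = similar(sendr)

    # Nonblocking sends/receives avoid deadlock regardless of rank ordering
    reqs = [
        MPI.Isend(sendl, comm; dest = left, tag = 0),
        MPI.Isend(sendr, comm; dest = right, tag = 1),
        MPI.Irecv!(recvr, comm; source = right, tag = 0),
        MPI.Irecv!(recvl, comm; source = left, tag = 1),
    ]
    MPI.Waitall(reqs)

    left  != MPI.PROC_NULL && (field[1]   = recvl[1])  # fill left halo
    right != MPI.PROC_NULL && (field[end] = recvr[1])  # fill right halo
    return field
end
```

With a CUDA-aware MPI build, my understanding is that device arrays can be passed to these calls directly, avoiding a staging copy through host memory, but I would appreciate corrections if that is naive.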
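For the frequency-domain monitor, the local-accumulation idea would look roughly like this (a sketch with made-up names, not an existing API): each rank keeps a running discrete Fourier transform of its own part of the field, updated once per time step, so no communication happens until the final gather.

```julia
# Hypothetical sketch: a DFT monitor that accumulates locally on each rank/GPU.
struct DFTMonitor
    freqs::Vector{Float64}     # angular frequencies to resolve
    accum::Matrix{ComplexF64}  # one accumulator column per frequency
end

DFTMonitor(freqs, npoints) =
    DFTMonitor(freqs, zeros(ComplexF64, npoints, length(freqs)))

# Called once per time step: accum[:, k] += field * exp(-im*ω_k*t) * dt
function update!(m::DFTMonitor, field::AbstractVector, t::Real, dt::Real)
    for (k, ω) in enumerate(m.freqs)
        m.accum[:, k] .+= field .* (cis(-ω * t) * dt)
    end
    return m
end
```

At the end of the run, the per-rank `accum` blocks could be collected with a single `MPI.Gather`-style call, which is the only communication the monitor would need. Whether this plays nicely with how Khronos structures its time-stepping loop is exactly what I am unsure about.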
Chmy.jl was initially suggested; it seems well suited to managing halo exchange, but I am unsure whether it is the best choice for handling the monitors efficiently.
Another package that looks promising for multi-GPU parallelization is ImplicitGlobalGrid.jl.
Any advice, package recommendations, or insights would be greatly appreciated!
I will continue to dig into the subject on my side, and hopefully I will have some good news to share soon!
Thanks for your help,
Lucas