Arrays, MPI, and broadcasting

ranocha · August 17, 2020, 10:02am

I would like to use some kind of AbstractArray based on MPI such that

broadcasting can be used for things like +, *, muladd, which are necessary for simple explicit Runge-Kutta methods (e.g. from OrdinaryDiffEq.jl)
custom functions can be used on the local part of the arrays (as usual in MPI-C/Fortran applications)

It’s okay to distribute the arrays along a single dimension across multiple CPUs.
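
To make the "single dimension" requirement concrete, here is a minimal sketch of a block partition along one dimension. It uses only Base Julia (no MPI), and the helper name `local_range` is hypothetical, not taken from any of the packages mentioned in this thread.

```julia
# Hypothetical helper: global index range owned by `rank` (0-based, as in MPI)
# when `n` entries are split as evenly as possible over `nprocs` processes.
function local_range(n::Integer, nprocs::Integer, rank::Integer)
    q, r = divrem(n, nprocs)
    # The first `r` ranks each get one extra entry.
    start = rank * q + min(rank, r) + 1
    len = q + (rank < r ? 1 : 0)
    return start:(start + len - 1)
end

# Ten entries over four processes.
ranges = [local_range(10, 4, rank) for rank in 0:3]
```

Each process would then allocate only `length(local_range(...))` entries along the distributed dimension and keep the other dimensions whole.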

Which approach would you recommend? I know there are some related implementations like

MPIStateArrays.jl in ClimateMachine.jl: Seems to be in active development but tied to ClimateMachine.jl
MPIArrays.jl: Seems to be a proof-of-concept of some functionality (not including broadcasting)

jipolanco · August 17, 2020, 10:21am

You can also look at PencilArrays for yet another approach of arrays that can be distributed along one or more dimensions using MPI.

PencilArrays are currently part of the PencilFFTs package, but can be used independently of the parallel FFT functionality. I’m planning to split them into a separate package soon.

ranocha · August 17, 2020, 10:30am

Also looks interesting, thanks! Does PencilArrays.jl support broadcasting and calling custom MPI kernels? If so, do you have an example of the latter?

johnh · August 17, 2020, 11:20am

PencilArrays is cool.
At the lower level, MPITopology uses MPI_Cart_create to define a Cartesian MPI communicator.

I guess it is nothing new really, but you can submit a job to a cluster and use information from the environment variables of the job (number of nodes, number of CPU cores per node etc.) to set up these communicators.
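
A rough sketch of that idea, assuming the job runs under Slurm (which sets `SLURM_NTASKS`) and that a 2D process grid is wanted. The helper `process_grid` is hypothetical and only mimics, in spirit, what `MPI_Dims_create` does; the resulting dimensions would then be handed to `MPI_Cart_create`.

```julia
# Hypothetical helper: split `nprocs` (>= 1) into a near-square 2D process grid.
function process_grid(nprocs::Integer)
    p = isqrt(nprocs)
    # Walk down from the integer square root to the nearest divisor.
    while nprocs % p != 0
        p -= 1
    end
    return (nprocs ÷ p, p)
end

# Assumption: Slurm sets SLURM_NTASKS; fall back to a single task otherwise.
ntasks = parse(Int, get(ENV, "SLURM_NTASKS", "1"))
dims = process_grid(ntasks)
```

Other schedulers expose the same information under different variable names, so the lookup would need to be adapted.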

Not really related, but I have set up ‘bladesets’ on HPC clusters where the jobs ‘prefer’ to run on nodes close to each other in the topology.

jipolanco · August 17, 2020, 1:42pm

I’m not sure if broadcasting fully works, but that would be really easy to add if it doesn’t.
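
As an illustration of why broadcasting support can be cheap to get, here is a minimal sketch of a wrapper around process-local data. This is not the PencilArrays implementation; `LocalBlock` is a hypothetical type, and only Base Julia is used. Once the basic `AbstractArray` interface is defined, in-place broadcasting works through Julia's generic fallbacks.

```julia
# Hypothetical wrapper around the block of a vector owned by one MPI process.
struct LocalBlock{T} <: AbstractVector{T}
    data::Vector{T}
end

# Minimal AbstractArray interface: size, getindex, setindex!.
Base.size(a::LocalBlock) = size(a.data)
Base.IndexStyle(::Type{<:LocalBlock}) = IndexLinear()
Base.getindex(a::LocalBlock, i::Int) = a.data[i]
Base.setindex!(a::LocalBlock, v, i::Int) = (a.data[i] = v; a)

u  = LocalBlock([1.0, 2.0, 3.0])   # local part of the state
k  = LocalBlock([0.5, 0.5, 0.5])   # local part of the right-hand side
dt = 0.1

# In-place explicit Euler update, u = u + dt * k, elementwise on local data only.
u .= muladd.(dt, k, u)
```

With only this interface, an out-of-place expression such as `u .+ k` would return a plain `Vector`; keeping the wrapper type in that case needs a custom `Base.BroadcastStyle` and a matching `similar` method.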

As for custom MPI kernels, yes it is possible, and it works pretty much the same way as in a C or Fortran MPI application. Indexing a PencilArray, either with linear or Cartesian indices, always yields a value in the local part of the array.
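
A minimal sketch of such local kernels, assuming a plain `Matrix` as a stand-in for the local part of the array so that it runs without MPI. The function names are hypothetical. The kernels only rely on indexing, so they should in principle accept any `AbstractArray` whose indices refer to local data.

```julia
# Stand-in for the local part of a distributed array on one process.
# Assumption: a plain Matrix is used here so the sketch runs without MPI.
u_local = fill(1.0, 4, 3)

# Hypothetical kernel using Cartesian indices; touches only local entries.
function scale_and_shift!(u::AbstractArray, a, b)
    for I in CartesianIndices(u)
        u[I] = a * u[I] + b
    end
    return u
end

# Hypothetical kernel using eachindex (linear indices for a plain Matrix):
# sum of squares of the local entries.
function local_sum_of_squares(u::AbstractArray)
    s = zero(eltype(u))
    for i in eachindex(u)
        s += u[i]^2
    end
    return s
end

scale_and_shift!(u_local, 2.0, 1.0)      # every entry becomes 2*1 + 1 = 3
s_local = local_sum_of_squares(u_local)  # 12 entries, each contributing 3^2 = 9
```

Neither kernel communicates. A global sum of squares would additionally need a reduction across processes, as with `MPI_Allreduce` in a C or Fortran code.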

For some examples you can look at the docs, where different approaches for iterating over arrays are compared.

ranocha · August 17, 2020, 2:03pm

Great, thank you all! I’ll dig deeper into PencilArrays and file an issue or something like that if I get into trouble with broadcasting.
