[ANN] MPIHaloArrays.jl

smillerc · March 4, 2022, 12:42am

MPIHaloArrays.jl is a new package that provides the MPIHaloArray array type, which is a subtype of AbstractArray, that facilitates the exchange of halo, or ghost, cells commonly used in domain-decomposition methods high performance computing problems. In my case, I use this method to solve the compressible Euler equations for fluid dynamics in parallel. While there are other existing libraries that have similar decomposition functionality, such as MPIArrays.jl, PencilArrays.jl, and ImplicitGlobalGrid.jl, I wanted to create an array type specifically for this task. Each library only provides only a portion of the functionality I needed, so I decided to try creating a new one! I’ve done this type of parallelization in Fortran 2018 with Coarrays, but have recently written a multi-threaded, multi-physics code in Julia (compressible-hydro + thermal conduction) for my dissertation. MPIHaloArray’s will form the basis for the transition to domain decomposed parallelization.

Domain-decomposition splits a large problem into small subdomains that live on different MPI ranks. Stencil operations commonly used in solving PDEs require neighbor information, and this is done with halo cells along the edge of each subdomain. In the image below, the domain is decomposed into 4 subdomains, with a single layer of halo cells on all sides (process 4 is shown exchanging orthogonal neighbor information with processes 3 and 2). Domain decomposition can happen in 1D, 2D, and 3D.

Features

Provides the MPIHaloArray array type
Domain decomposition can happen in1D, 2D, or 3D
Neighbor exchange with arbitrarily sized halo cell regions. Note, the number of halo cells is currently fixed across all dimensions
Communication is currently handled by MPI.ISend and MPI.IRecv underneath, but future versions will give the option for one-sided communication with MPI.Put and MPI.Get
Halo exchange can either be orthogonal-only (e.g. [i+1,j,k]) , or include corner information as well (e.g [i+1,j+1,k])

TODO

GPU testing – I don’t currently have a local GPU to test this with, but future work will include GPU-aware MPI
Get rid of the limitation of 1D, 2D, or 3D arrays. For example, an array U could have dimensions for [q,i,j], but halo exchanges are only done on i and j.
Optimization of neighbor communication (caching, etc…)
Finish one-sided implementation

Examples

Quickstart

using MPI, MPIHaloArrays

MPI.Init()
const comm = MPI.COMM_WORLD
const rank = MPI.Comm_rank(comm)
const nprocs = MPI.Comm_size(comm)
const root = 1

# Create a topology type the facilitates neighbor awareness and global size
topology = CartesianTopology(comm, 
                             [4,2], # domain decompose in a 4x2 grid of ranks
                             [false, false]) # no periodic boundaries

nhalo = 2

# Create some data on the local rank
local_data = rand(10,20) * rank
A = MPIHaloArray(local_data, topology, nhalo)
fillhalo!(A, -1)

# Perform halo exchange
updatehalo!(A)

# Or start with a large global array
B_global = rand(512,512)

# divide by rank and return B::MPIHaloArray
B_local = scatterglobal(B_global, root, nhalo, topology; 
           do_corners = false) # if your algorithm doesn't need 
                                # corner info, this will save communication
# do some work
# ....
updatehalo!(B_local)
# do some work
# ....

# and pull it back to the root rank
B_result = gatherglobal(B_local; root=root)

GC.gc()
MPI.Finalize()

2D Diffusion

See in the example here in the github repository.

Documentation

See the docs here

I’d love to hear your feedback or contributions!

hunKing · August 14, 2024, 8:07pm

Hey!

Thanks for the package; I’ve been studying it so I can use it in my thesis. Doing domain decomposition manually is tedious so this will go a long way to facilitate this process. Thank you for your effort!

smillerc · August 15, 2024, 12:49pm

Glad to hear that you’re using (or planning to use) it! Let me know if anything doesn’t work or if you see poor performance.

Topic		Replies	Views
One-sided MPI with Subarrays Julia at Scale mpi	4	665	September 22, 2021
ParallelStencil w/MPI Julia at Scale mpi	2	510	April 6, 2022
Remote access arrays in a numerical grid Julia at Scale	6	995	April 19, 2019
ANN: MPI.jl v0.10.0: new build process and CUDA-aware support Julia at Scale	26	2735	October 31, 2020
Arrays, MPI, and broadcasting Julia at Scale question	5	899	August 17, 2020

[ANN] MPIHaloArrays.jl

Features

TODO

Examples

Documentation

Related topics