Custom distribution for DistributedArrays

Maurizio_Tomasi · March 19, 2018, 12:08pm

Hi to everybody,

I am going to port an old code of mine to Julia. Since this code uses MPI, I figured I could use DistributedArrays.jl. My code performs a calculation on very large arrays (~1 TB) and uses MPI to distribute the data among the nodes of a computing cluster. The way the data are distributed depends on a number of factors: each node gets roughly the same amount of data, but not exactly. Suppose, for instance, that my very long array has 100 elements and that I am running the code on two nodes; depending on the way the data were taken, I might end up splitting the array into two chunks containing 56 and 44 elements respectively. This helps the computation, as many operations have to be performed separately on the two sequences of 56 and 44 elements, and I found that using this split instead of the simpler 50+50 scheme improves the speed.

Is there a way to tell a function like dzeros to use a custom partitioning scheme by specifying exactly how many elements to assign to each worker? I was able to find how to specify the number of splits in each dimension, but this seems to produce partitions of equal size (i.e., 50+50).
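
To make the layout I have in mind concrete, here is a minimal sketch in plain Julia of the index ranges each worker should own. The helper chunk_ranges is a hypothetical name of my own, not part of DistributedArrays.jl; it only turns a list of per-worker chunk sizes into contiguous ranges and does not distribute anything.

```julia
# Hypothetical helper (not part of DistributedArrays.jl): convert a list of
# per-worker chunk sizes into contiguous, 1-based index ranges.
function chunk_ranges(sizes::AbstractVector{<:Integer})
    stops = cumsum(sizes)           # last index of each chunk
    starts = stops .- sizes .+ 1    # first index of each chunk
    return [a:b for (a, b) in zip(starts, stops)]
end

# The 100-element example from above, split unevenly over two workers.
ranges = chunk_ranges([56, 44])
```

With the sizes [56, 44], the first worker would own indices 1:56 and the second 57:100, rather than the 1:50 and 51:100 that an even split gives.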

Thanks a lot,
Maurizio.
