Best practice for data-parallel computation using DistributedArrays

jinliangwei · March 30, 2018, 10:14pm

There is a popular programming paradigm in distributed ML training called data parallelism, which I briefly describe below. What is the best way to program it using DistributedArrays (or anything else that might be a better fit)? I am asking because I am working on a project for which data parallelism is one of the use cases it is supposed to make efficient and simple, and I would like to better understand how my approach compares to best practices with existing tools in Julia.

Data parallelism: say I have two large arrays A and B. Both arrays are so large that neither fits in a single machine’s memory. I would like to process array B’s elements one by one. To process each element of B, I need to read and write some elements of A, and the indices into A depend on the value of the B element. For each element of B, the number of elements of A that I need to access is small enough to fit in one machine’s memory.
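To make the access pattern concrete, here is a minimal single-process sketch using ordinary arrays. The helper `indices_for` and the update rule (adding 1.0) are hypothetical placeholders; in a real workload both would be defined by the model being trained.

```julia
# Hypothetical mapping from one element of B to the few indices of A it touches.
indices_for(b::Int, n::Int) = [mod1(b + k, n) for k in 0:2]

function process!(A::Vector{Float64}, B::Vector{Int})
    n = length(A)
    for b in B
        idx = indices_for(b, n)   # the indices depend on the value of b
        vals = A[idx]             # read a small working set of A
        A[idx] = vals .+ 1.0      # write the updated values back (placeholder update)
    end
    return A
end

A = zeros(8)
B = [1, 4, 7]
process!(A, B)
```

In the distributed setting, the reads and writes of `A[idx]` are the steps that may cross process boundaries.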

It seems most natural to create arrays A and B as two DistributedArrays. My understanding is that DistributedArrays allow my workers to read and write elements stored in other processes. Am I correct? If I were to access 10 elements of A for processing one element of B, say A[1], A[20], A[84], A[143], …, will this cause 10 reads over the network to fetch the relevant values? If those 10 reads happen one after another, the latency can be very long.
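For comparison, the latency concern can be illustrated with a rough sketch that batches reads by owning process, so that each owner receives one request instead of one request per element. This sketch uses only the Distributed standard library, not the DistributedArrays API, and it makes no claim about how DistributedArrays behaves internally. The layout (fixed-size contiguous chunks) and the names `CHUNK_LEN`, `LOCAL_A`, `init_chunk!`, `read_local`, and `fetch_batched` are all assumptions made for illustration.

```julia
using Distributed
addprocs(2)                      # two local worker processes for the demo

@everywhere begin
    const CHUNK_LEN = 100        # assumed: every worker owns 100 contiguous elements
    # Each process keeps its slice of A here (a stand-in for a locally held part).
    const LOCAL_A = Ref{Vector{Float64}}(Float64[])
    # Fill the local slice so that element i of the global array holds the value i.
    function init_chunk!(first_idx::Int)
        LOCAL_A[] = collect(Float64, first_idx:(first_idx + CHUNK_LEN - 1))
        return nothing
    end
    # Read several global indices owned by this process in a single call.
    read_local(first_idx::Int, idxs::Vector{Int}) = LOCAL_A[][idxs .- (first_idx - 1)]
end

ws = workers()
for (k, w) in enumerate(ws)
    remotecall_wait(init_chunk!, w, (k - 1) * CHUNK_LEN + 1)
end

function fetch_batched(idxs::Vector{Int}, ws)
    # Group the requested indices by the chunk (and hence worker) that owns them.
    groups = Dict{Int,Vector{Int}}()
    for i in idxs
        push!(get!(groups, cld(i, CHUNK_LEN), Int[]), i)
    end
    # Issue one non-blocking request per owning worker, then collect the replies.
    futures = [(g, remotecall(read_local, ws[k], (k - 1) * CHUNK_LEN + 1, g))
               for (k, g) in groups]
    result = Dict{Int,Float64}()
    for (g, f) in futures
        for (i, v) in zip(g, fetch(f))
            result[i] = v
        end
    end
    return result
end

vals = fetch_batched([1, 20, 84, 143], ws)
```

Here the four indices are served by two requests (one per owning worker) that are in flight at the same time, rather than four sequential round trips.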

jinliangwei · April 3, 2018, 7:46pm

The question might be too long and ambiguous. It would also be very helpful if someone could point me to some more sophisticated examples using DistributedArrays or other good distributed libraries, so that I could learn on my own.
