Some Questions about Parallelization

Juser · March 4, 2018, 10:19pm

I have a couple questions about parallelism in Julia that I am hoping someone might know the answers to:

(1) Is there an easy way to pass workers a copy of a variable? For example, if I have created X on the master process and would like to compute Y=sum(X) on each of the workers, then each of the workers needs to see X.

(2) Is there any way to share a vector of shared vectors? I have a number of cases where a Vector{SharedArray} or even a Vector{Vector{SharedArray}} is a convenient way to structure data. However, using such storage hurts parallel performance because of the overhead of passing the outer Vector or Vector{Vector} part to the worker processes. My current thought is to declare each of these pointing vectors as an @everywhere const so that each process has a copy, but I’m not sure that this is the cleanest or best way to do this.

Any suggestions or feedback would be appreciated. Thank you!

Tomas_Pevny · March 5, 2018, 6:49am

I use
https://github.com/ChrisRackauckas/ParallelDataTransfer.jl
for the first one.
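For readers who want to see the idea without an extra package, here is a minimal sketch using only the Distributed standard library. It assumes two local workers and an illustrative X; it relies on the `$` interpolation that `@everywhere` supports to copy a value from the master into a global on each worker. It is not taken from the package linked above.

```julia
using Distributed
addprocs(2)  # assumption: two local worker processes

# Illustrative data created on the master process.
X = collect(1.0:10.0)

# Copy the master's X into a global named X on every worker.
# The $ interpolates the master's value into the expression before it is sent.
@everywhere workers() X = $X

# Each worker computes Y from its own local copy of X.
@everywhere workers() Y = sum(X)

# Read each worker's Y back on the master to check the result.
results = [remotecall_fetch(Core.eval, w, Main, :Y) for w in workers()]
println(results)
```

Each worker ends up with its own independent copy, so later changes to X on the master are not seen by the workers unless X is sent again.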

I do not know the answer for the second.

Juser · March 5, 2018, 4:13pm

Thanks! This looks like exactly what I need for (1).

tk3369 · March 6, 2018, 6:45am

I am also interested in sharing a struct of SharedArrays, but from my previous tests, passing one around seems to perform poorly, as you indicated above.

I think the workaround is to create these SharedArrays individually in the global scope… and if you have many of them, then you would come up with a naming convention that distinguishes them from each other.

Juser · March 13, 2018, 10:18pm

I have done some tests, and you’re right. Passing SharedArrays around in vectors or structs leads to extremely poor performance.

Are you suggesting using something that is like a macro to reference the different SharedArrays? The nice thing about vectors of SharedArrays is that I can reference them by their index in my code.

tk3369 · March 14, 2018, 4:56am

For some reason, I cannot replicate this problem anymore… Do you have a minimal working example (MWE) to share?

tk3369 · March 18, 2018, 4:27am

I thought the performance was bad when passing SharedArray’s in another data structure but my latest test results do not show any material difference.
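As an illustration only (this is not the test referred to above), a minimal sketch of passing a Vector of SharedArrays to a worker might look like the following. The array sizes and the helper `fill_one!` are hypothetical, and the sketch assumes all processes run on one machine so that the shared memory segments are visible to every worker.

```julia
using Distributed
addprocs(2)  # assumption: two local worker processes on the same host
@everywhere using SharedArrays

# Hypothetical data: three shared vectors held in an ordinary Vector.
arrays = [SharedArray{Float64}(1000) for _ in 1:3]

# Hypothetical helper: write the index i into every element of the
# i-th shared array in the container.
@everywhere function fill_one!(container, i)
    S = container[i]
    for j in eachindex(S)
        S[j] = i
    end
    return nothing
end

# Pass the whole container to a worker; the writes happen in shared memory.
w = first(workers())
for i in eachindex(arrays)
    remotecall_wait(fill_one!, w, arrays, i)
end

# The master sees the values written by the worker.
sums = [sum(a) for a in arrays]
println(sums)
```

Wrapping the `remotecall_wait` loop in `@elapsed`, once with the container and once with the arrays passed individually, is one way to compare the two approaches on a given machine.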

Juser · March 18, 2018, 2:51pm

@tk3369, thank you for following up on this. I have run similar tests, except passing large numbers of shared arrays one-by-one and without the use of $, and the performance was generally poor. I will not have access to the machine on which I ran the tests for a few days, but I will look again and try to understand what is driving the difference in performance. I will try to report back later this week.

tk3369 · March 18, 2018, 5:54pm

You’re welcome.

You must use $ to interpolate variables into the benchmark expression. Otherwise the benchmark accesses them as variables in the global scope, which is slow. More details are in the BenchmarkTools README file.
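A minimal sketch of the difference, assuming the BenchmarkTools package is installed and using an illustrative global vector `x`:

```julia
using BenchmarkTools

# A non-constant global variable; its type is not known to the compiler
# at the point where the benchmark expression refers to it.
x = rand(1000)

# Without $: the benchmark refers to x as a global on every evaluation.
t_global = @belapsed sum(x)

# With $: the value of x is interpolated into the benchmark expression,
# so the measurement does not include the cost of the global lookup.
t_interp = @belapsed sum($x)

println("without \$: ", t_global, " s")
println("with \$:    ", t_interp, " s")
```

The exact timings depend on the machine and Julia version, so no numbers are shown here; the gap tends to matter most when the benchmarked call itself is very cheap.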
