Struggling with pmap


I am struggling to understand why pmap is so slow. As a test, I use the following fast function, rand():

using Distributed
@everywhere using Random
@everywhere g(x) = rand()
@time map(g, 1:100_000_000);
  0.697132 seconds (10 allocations: 762.940 MiB, 3.11% gc time)
@time pmap(g, 1:100_000_000);
# still running...

In my use case, I have a function g that performs stochastic simulations and is quite fast to execute, but I want to run it millions of times.

I am sorry if this is a trivial question, but can someone give me a hint about this behaviour and possibly how to improve it?

Thank you

Best regards


Can you communicate with those distributed processes? Are they running on your local machine? I guess so…
Is it possible that you have used a lot of memory and have started swapping on your local machine?

You are right. Single machine, multiple processes.

It may have to do with the overhead of pmap. In general, pmap should be used when the function g does a large amount of work, enough to offset the cost of that overhead. In your case, the function g is much faster than the time it takes to send the work out to the different processes. If you want to parallelize a fast function such as your g, either use low-level primitives such as @spawnat and fetch, or a macro like @distributed on your for loop.
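To illustrate that suggestion, here is a minimal sketch of a reducing @distributed loop for this kind of fast function. The worker count of 4 is an assumption; adjust it for your machine.

```julia
using Distributed
addprocs(4)  # hypothetical worker count; adjust for your machine

# A reducing @distributed loop partitions the range across the workers
# once, so the per-iteration overhead is tiny compared to pmap's
# per-call scheduling.
total = @distributed (+) for i in 1:10_000_000
    rand()
end
println(total / 10_000_000)  # sample mean, close to 0.5
```

With a reducer such as (+), @distributed blocks until all workers finish and returns the combined result on the calling process.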

What do these return after you addprocs?

@affans makes an excellent point. If you are going to parallelize by distributing functions, you need to give each worker a ‘decent’ amount of work to do. My apologies to non-English native speakers.
As @affans says, the time taken to do the task should be greater than the time to communicate, i.e. to send out and return the data.

Using your suggestions, I got it to work better (not for the MWE posted here) by making g compute more. Something along these lines:

@everywhere g(x) = rand(1_000_000)
@time pmap(g, 1:100);
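If pmap is still preferred for the original tiny g, note that it also accepts a batch_size keyword, which ships many inputs per worker round-trip instead of one. A minimal sketch, again assuming four local workers:

```julia
using Distributed
addprocs(4)  # hypothetical worker count; adjust for your machine

@everywhere g(x) = rand()

# batch_size groups 10_000 calls into one message per round-trip, so
# pmap's scheduling overhead is paid once per batch, not once per call.
results = pmap(g, 1:1_000_000; batch_size=10_000);
println(length(results))
```

The batch size is a tuning knob: larger batches amortize more overhead but reduce load-balancing granularity.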

I still think that pmap is not the way to go here (in the context of your function). See this excerpt from the documentation:

Julia’s pmap is designed for the case where each function call does a large amount of work. In contrast, @distributed for can handle situations where each iteration is tiny, perhaps merely summing two numbers. Only worker processes are used by both pmap and @distributed for for the parallel computation. In case of @distributed for , the final reduction is done on the calling process.

Can you try @distributed for and see if it speeds up your result?
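For completeness, a sketch of what that might look like when the individual results are needed rather than just a reduction, using a SharedArray so that all local workers can write into the same buffer (the names and sizes here are illustrative):

```julia
using Distributed, SharedArrays
addprocs(4)  # hypothetical worker count; adjust for your machine

@everywhere g(x) = rand()

n = 1_000_000
out = SharedArray{Float64}(n)  # shared among processes on one machine

# @distributed splits 1:n across the workers once; @sync waits until
# every worker has finished its chunk.
@sync @distributed for i in 1:n
    out[i] = g(i)
end
println(sum(out) / n)  # sample mean, close to 0.5
```

Note that SharedArray only works for processes on a single machine, which matches the single-machine setup described above.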

This thread may help too; it echoes what was said above: