I am currently coding an algorithm that performs the same task on several pieces of a dataset in parallel using pmap, and then gathers the results in a SharedArray. Due to the exploratory nature of the algorithm, the execution time of the task can vary widely depending on the part of the dataset it runs on, and I have no way to predict it.
Hence, when running my code on a cluster, I eventually reach a point where a few workers are still busy while all the others sit idle.
My question is: is it possible for busy workers to somehow “request the help” of idle workers, since the remaining parts of their task can themselves be parallelized?
In other words: can a worker process in turn call pmap? And how would I define the pool of idle workers inside a pmap call?
I am kind of a newbie with Julia and parallel computing, so I hope my question is clear enough.
using Distributed

function pmap(f, lst)
    np = nprocs()                   # determine the number of processes available
    n = length(lst)
    results = Vector{Any}(undef, n) # undef is required on Julia 1.0 and later
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    nextidx() = (idx = i; i += 1; idx)
    @sync begin
        for p = 1:np
            if p != myid() || np == 1
                @async begin
                    while true
                        idx = nextidx()
                        if idx > n
                            break
                        end
                        results[idx] = remotecall_fetch(f, p, lst[idx])
                    end
                end
            end
        end
    end
    results
end
I don’t really see here how I can modify the code so that busy workers ask idle workers in the pool to help them. Maybe I am missing something simple…
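For concreteness, here is the kind of nested call I have in mind, as a minimal sketch; refine and process_piece are placeholder names, and the idle_ids argument is exactly the part I do not know how to obtain:

using Distributed
addprocs(6)                                   # hypothetical stand-in for the cluster

@everywhere using Distributed
@everywhere refine(block) = sum(block .^ 2)   # stand-in for the granular subtask

# a busy worker fans its remaining subtasks out to the workers that are
# currently idle; pmap does accept an explicit WorkerPool as second argument
@everywhere function process_piece(piece, idle_ids)
    pool = WorkerPool(idle_ids)
    blocks = collect(Iterators.partition(piece, 5))
    return pmap(refine, pool, blocks)
end

# e.g. from worker 2, supposing workers 3, 6 and 7 were idle:
# remotecall_fetch(process_piece, 2, 1:20, [3, 6, 7])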
It’s fine to pmap a function that itself executes another pmap of a subfunction over more granular computing tasks. Under the hood, the top-level pmap divides the array into small chunks. These chunks then get divided once more into the more granular computing tasks, which are queued. All workers then process the queued tasks until nothing is left in the queue.
This is a common pattern.
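A minimal sketch of that pattern (the names are placeholders, not from the original code):

using Distributed
addprocs(4)                                   # hypothetical local setup
@everywhere using Distributed

@everywhere fine_task(x) = x^2                # stand-in for the granular work

# the outer task runs a nested pmap over the finer items of its chunk;
# whether the inner pmap really distributes depends on the worker pool
# the worker sees, which is what the rest of the thread digs into
@everywhere coarse_task(chunk) = pmap(fine_task, chunk)

chunks = [1:10, 11:20, 21:30]                 # stand-in for pieces of the dataset
results = pmap(coarse_task, chunks)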
That alternative pmap function code, which I’ve used and modified myself, is good at scheduling new jobs to idle workers. But it can’t take work from an already-busy worker and somehow hand part of it to an idle worker. Even if it could, it wouldn’t know how to divide up your task or how to move the data efficiently. It comes down to you designing and monitoring your tasks so that they strike a good balance between data transfer and actual work. Maybe you can divide things up differently?
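For instance, one way to divide things up differently is to flatten the work into many small, uniform tasks up front, so the greedy scheduler keeps every worker busy. A sketch with placeholder names:

using Distributed
addprocs(4)                                   # hypothetical local setup

@everywhere work_on(block) = sum(block)       # stand-in for the real computation

pieces = [1:100, 101:200, 201:300]            # stand-in for the dataset pieces

# flatten every piece into small blocks; one flat pmap then schedules all
# blocks greedily, so no worker sits idle while another is overloaded
blocks = [b for p in pieces for b in Iterators.partition(p, 10)]
partials = pmap(work_on, blocks)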
With regard to nested pmap: if all 32 workers are busy with a primary job, then what happens when one of those workers starts a nested pmap? It seems to me that there are no more workers available, so the task would run on the primary job’s own process, and nothing would be gained in that fully loaded scenario. I can see how it might help if the workers weren’t all busy.
pmap does nothing magical; the magic happens in @async. There you hand control to the task scheduler, and with a nested pmap you do it once more. And even the nested pmap sees the whole worker pool…
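The empty sets below presumably came from an experiment along these lines; the exact code was not shown in the thread, so this is a hypothetical reconstruction (and on recent Julia versions, where workers fetch the default pool from the master, it would print the full worker set instead):

using Distributed
addprocs(6)                      # hypothetical setup: workers 2 through 7
@everywhere using Distributed

# keep workers 2, 4 and 5 busy and have each print the worker set of the
# default pool it sees; .workers is an internal field of WorkerPool, used
# here only for inspection
pmap(WorkerPool([2, 4, 5]), 1:3) do _
    sleep(1)                     # make sure the three tasks overlap
    println(default_worker_pool().workers)
end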
From worker 2: Set{Int64}()
From worker 4: Set{Int64}()
From worker 5: Set{Int64}()
So if I understand correctly, only three workers should be busy here; they should therefore see workers 3, 6 and 7 as idle, and perhaps available for other tasks. So why does it return an empty set of idle workers?