Distributed functions, but never run concurrently

torrance · May 5, 2022, 2:29am

I have N calls to a function f() that I want to run across a distributed set of M worker processes, where N > M.

Naively, I tried:

for _ in 1:N
    @spawnat :any f()
end

However, this can run several f() calls concurrently on the same worker process, which I don’t want since f() uses too many resources for this. Ideally, I want this loop to block at @spawnat if there are no idle workers.

Using a worker pool and pmap seems to solve this. e.g.:

using Distributed
addprocs(2)
pmap(1:4) do i
    println("Starting... $(i)")
    sleep(5)
    println("Done $(i).")
end

  From worker 2:    Starting... 1
  From worker 3:    Starting... 2
  From worker 2:    Done 1.
  From worker 3:    Done 2.
  From worker 3:    Starting... 3
  From worker 2:    Starting... 4
  From worker 2:    Done 4.
  From worker 3:    Done 3.

This only sends work to a process once it is idle. That’s what I want.

However, pmap() is not really ergonomic in my case. I saw that there is remotecall(f, ::WorkerPool), which the documentation suggests behaves similarly: “Wait for and take a free worker from pool and perform a remotecall on it.”

However, remotecall() doesn’t behave as the documentation would suggest: all four calls start immediately, two on each worker. E.g.:

using Distributed
addprocs(2)
wp = default_worker_pool()
for i in 1:4
    remotecall(wp) do
        println("Starting... $(i)")
        sleep(5)
        println("Done $(i).")
    end
end

  From worker 2:    Starting... 1
  From worker 3:    Starting... 2
  From worker 3:    Starting... 4
  From worker 2:    Starting... 3
  From worker 3:    Done 2.
  From worker 2:    Done 1.
  From worker 2:    Done 3.
  From worker 3:    Done 4.

Am I doing something wrong? Or is the documentation wrong/misleading?

greg_plowman · May 5, 2022, 3:17am

You can use remotecall_wait or remotecall_fetch asynchronously.
Does this do what you want?

@sync for i in 1:10
    @async remotecall_wait(wp) do
        println("Starting... $(i)")
        sleep(5)
        println("Done $(i).")
    end
end

I think this is similar to the way pmap works, so curious why pmap doesn’t suit your use case.

Oh, I guess you want to call different functions?
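
If the jobs are different functions, the same pattern can be sketched with a list of (function, arguments) pairs. This is only a minimal sketch: task_a and task_b are hypothetical stand-ins for the real work, and it assumes two workers and the default worker pool.

```julia
using Distributed
addprocs(2)

# Hypothetical stand-ins for the different functions to run.
@everywhere task_a() = (sleep(1); :a)
@everywhere task_b(x) = (sleep(1); x^2)

wp = default_worker_pool()

# Each job is a (function, argument tuple) pair.
jobs = [(task_a, ()), (task_b, (3,)), (task_a, ()), (task_b, (4,))]
results = Vector{Any}(undef, length(jobs))

@sync for (i, (f, args)) in enumerate(jobs)
    # remotecall_fetch on a pool only returns once the call has finished,
    # so each worker handles one job at a time.
    @async results[i] = remotecall_fetch(f, wp, args...)
end
```

Each @async task waits for its own call, so the results land in job order regardless of which worker ran them.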

torrance · May 5, 2022, 6:13am

@greg_plowman Yes, pmap() isn’t flexible enough for my needs.

Using @async remotecall_wait() does indeed work, but I feel like if the intended behaviour of remotecall() is as it currently is, then the documentation is misleading.
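
As far as I can tell, the difference is that remotecall() on a pool returns its Future as soon as the call is dispatched, so the worker goes back into the pool before the job finishes. A minimal sketch of a loop that itself blocks until a worker is idle, assuming take! and put! can be called directly on the pool to check a worker out and back in (run_job is a hypothetical placeholder for the real work):

```julia
using Distributed
addprocs(2)

# Hypothetical job: reports which worker ran it.
@everywhere run_job(i) = (sleep(1); (i, myid()))

function run_all(wp, n)
    results = Vector{Any}(undef, n)
    @sync for i in 1:n
        pid = take!(wp)      # the loop blocks here while every worker is checked out
        @async try
            results[i] = remotecall_fetch(run_job, pid, i)
        finally
            put!(wp, pid)    # hand the worker back once the job is done
        end
    end
    return results
end

results = run_all(default_worker_pool(), 4)
```

Unlike the @async remotecall_wait(wp) version, the tasks here are created one at a time as workers become free, rather than all being queued up front.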

torrance · May 6, 2022, 2:03am

I have created a bug report to clarify the documentation here: https://github.com/JuliaLang/julia/issues/45203
