Distributed functions, but never run concurrently

I have a function N functions f() that I want to run across a distributed set of M worker processes, where N functions > M workers.

Naively, I tried:

for _ in 1:N
    @spawnat :any f()
end

However, this runs f() concurrently on the worker processes, which I don’t want since f() uses too many resources for this. Ideally, I want this loop to block at @spawnat if there are no idle workers.

Using a worker pool and pmap seems to solve this. e.g.:

addprocs(2)
pmap(1:4) do i
    println("Starting... $(i)")
    sleep(5)
    println("Done $(i).")
end

  From worker 2:    Starting... 1
  From worker 3:    Starting... 2
  From worker 2:    Done 1.
  From worker 3:    Done 2.
  From worker 3:    Starting... 3
  From worker 2:    Starting... 4
  From worker 2:    Done 4.
  From worker 3:    Done 3.

This only sends work to a process once it is idle. That’s what I want.

However, pmap() is not really ergonomic in my case. So I saw there is remotecall(f, ::WorkerPool) which the documentation suggests behaves similarly: “Wait for and take a free worker from pool and perform a remotecall on it.”

However, remotecall() doesn’t behave like the documentation would suggest. E.g.:

addprocs(2)
wp = default_worker_pool()
for i in 1:4
    remotecall(wp) do
        println("Starting... $(i)")
        sleep(5)
        println("Done $(i).")
    end
end

  From worker 2:    Starting... 1
  From worker 3:    Starting... 2
  From worker 3:    Starting... 4
  From worker 2:    Starting... 3
  From worker 3:    Done 2.
  From worker 2:    Done 1.
  From worker 2:    Done 3.
  From worker 3:    Done 4.

Am I doing something wrong? Or is the documentation wrong/misleading?

You can use remotecall_wait or remotecall_fetch asynchronously.
Does this do what you want?

@sync for i in 1:10
    @async remotecall_wait(wp) do
        println("Starting... $(i)")
        sleep(5)
        println("Done $(i).")
    end
end

I think this is similar to the way pmap works, so curious why pmap doesn’t suit your use case.

Oh, I guess you want to call different functions?

@greg_plowman Yes, pmap() isn’t flexible enough for my needs.

Using @async remotecall_wait() does indeed work, but I feel like if the intended behaviour of remotecall() is as it currently is, then the documentation is misleading.

I have created a bug report to clarify the documentation here: Documentation of Distributed.remotecall() misleading · Issue #45203 · JuliaLang/julia · GitHub