I have a function N functions f()
that I want to run across a distributed set of M worker processes, where N functions > M workers.
Naively, I tried:
for _ in 1:N
@spawnat :any f()
end
However, this runs f() concurrently on the worker processes, which I don’t want since f() uses too many resources for this. Ideally, I want this loop to block at @spawnat
if there are no idle workers.
Using a worker pool and pmap
seems to solve this. e.g.:
addprocs(2)
pmap(1:4) do i
println("Starting... $(i)")
sleep(5)
println("Done $(i).")
end
From worker 2: Starting... 1
From worker 3: Starting... 2
From worker 2: Done 1.
From worker 3: Done 2.
From worker 3: Starting... 3
From worker 2: Starting... 4
From worker 2: Done 4.
From worker 3: Done 3.
This only sends work to a process once it is idle. That’s what I want.
However, pmap()
is not really ergonomic in my case. So I saw there is remotecall(f, ::WorkerPool)
which the documentation suggests behaves similarly: “Wait for and take a free worker from pool and perform a remotecall on it.”
However, remotecall()
doesn’t behave like the documentation would suggest. E.g.:
addprocs(2)
wp = default_worker_pool()
for i in 1:4
remotecall(wp) do
println("Starting... $(i)")
sleep(5)
println("Done $(i).")
end
end
From worker 2: Starting... 1
From worker 3: Starting... 2
From worker 3: Starting... 4
From worker 2: Starting... 3
From worker 3: Done 2.
From worker 2: Done 1.
From worker 2: Done 3.
From worker 3: Done 4.
Am I doing something wrong? Or is the documentation wrong/misleading?