Are there any Julia parallel frameworks which allow adding workers while a computation is running?

I have a large trivially parallel computation I’m running with pmap. Say its been going for a while with 64 workers, then another 64 cores become available on a shared cluster.

Are there any Julia frameworks that allow me to attach 64 new workers to the running process for a total of 128, and have the scheduler just get those into the loop so the pmap finishes the remaining computation twice as fast as it would have?

2 Likes

You can give the “elastic” parallel map in Schedulers.jl a try. I usually use it with AzManagers.jl. But, it should work with any cluster manager. There are some caveats with respect to what type of code is auto-loaded onto the new workers that you need to be careful of.

3 Likes

That looks great, thank you! Going to take a look at it.

I was chatting on slack with @jpsamaroo and he proposed:

using Distributed, ClusterManagers
# Start and initialize a few workers
ws = addprocs(
    SGEManager(2, `-l avx512 -l h_vmem=3G`, projectdir()); topology=:master_worker
)
@everywhere ws begin
    # Init worker
end
# Create initial worker pool
pool = WorkerPool(ws)
t = @async begin
    # Start and initialize additional workers
    new_ws = addprocs(
        SGEManager(100, `-l avx512 -l h_vmem=3G`, projectdir()); topology=:master_worker
    )
    @everywhere new_ws begin
        # Init worker
    end
    # Add to WorkerPool
    for w in new_ws
        push!(pool, w)
    end
end
result = @showprogress pmap(sweep_range, pool) do parameter
    heavy_computation(parameter)
end

However, I was trying it out today and the hurdle I have now is statement like using PackageA need to be at top level. Hence, syntax as

t = @async begin
    @everywhere using PackageA
end

is not valid. If you don’t need to load packages at every worker the code works.

Have you tried

t = @async begin
    @everywhere @eval using PackageA
end
1 Like

@samtkaplan How does shedulers handle the prerequisites that workers needs, e.g., @everywhere using PackageA? Will it automatic run the previous everywhere statements?

Edit: Nevermind, you can provide an init method.

@eval works :slight_smile:

Actually, it doesn’t use the “init” method for that. It tries to detect what is loaded and automatically load those things on new workers. It does not work for everything (e.g. struct definitions that are not defined in a package). There are two functions that try to do this auto loading:

The “init” merhod is used for things like loading data from storage onto the new machine.