Rather than calling sbatch or srun directly, I would use the ClusterManagers.jl package. That said, it is still important that you understand how srun and sbatch work, since that's what ClusterManagers.jl uses under the hood. It will also help you manage your cluster better. Next, I would recommend reading the Multi-processing and Distributed Computing · The Julia Language manual to understand the functions and methods for running parallel code.
After that, it’s a fairly easy process. First, add your worker processes.
using Distributed, ClusterManagers
addprocs(SlurmManager(nProcs), N=nNodes)  # plus any other keyword arguments you need
Then define your computationally expensive function on all worker processes.
@everywhere function work(sim_id)
    # do the heavy, expensive calculation here
end
Then use the high-level, user-friendly pmap function to run your function over the workers, i.e.
pmap(x -> work(x), 1:nsims)
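To try the whole flow without a cluster, here is a minimal sketch that you can run on a laptop. It assumes two local workers started with plain addprocs(2) as a stand-in for the SlurmManager call, and the body of work (a sum of squares) and the value of nsims are made up purely for illustration.

```julia
using Distributed

# Assumption: two local workers stand in for the Slurm-launched ones.
# On a cluster, this is where the addprocs(SlurmManager(...)) call would go.
addprocs(2)

# Define the function on every process, including the workers.
@everywhere function work(sim_id)
    # Hypothetical placeholder for an expensive simulation.
    return sum(abs2, 1:sim_id)
end

nsims = 8  # illustrative value

# pmap hands each sim_id to a free worker and returns results in input order.
results = pmap(work, 1:nsims)

println(results)

# Release the workers once the job is finished.
rmprocs(workers())
```

Passing work directly is equivalent to the x -> work(x) form; the anonymous function is only needed when you want to fix extra arguments.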
See my reply here for more information: Parallel programming capabilities in Julia - Usage / First steps - JuliaLang