Distributing a job across a cluster with SlurmClusterManager.jl

I asked about this on the Slack channel, but didn’t get a response.

I’ve been using SlurmClusterManager and have been pretty satisfied with it, but there’s a job I want to accomplish with it that I’m not sure how to set up.

The job I want to do has m cases. Each of these cases can be done independently, and I would like to assign a whole node, with all n of its cores, to work on each case. If each of the m cases required only 1 core, I would run my sbatch command with --ntasks=m and --cpus-per-task=1; the Julia code would then work fine with a naive pmap call.
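For reference, the single-core version I have in mind looks roughly like the sketch below. To keep it runnable anywhere, two local workers stand in for the Slurm tasks; on the cluster the `addprocs(2)` line would be replaced by `using SlurmClusterManager; addprocs(SlurmManager())` inside a job submitted with --ntasks=m and --cpus-per-task=1. The function `solve_case` is a hypothetical placeholder for the real per-case work.

```julia
using Distributed

# Local stand-in for the cluster. Inside an sbatch job this would be:
#   using SlurmClusterManager
#   addprocs(SlurmManager())
addprocs(2)

# Hypothetical per-case work; defined on every process so pmap can call it.
@everywhere solve_case(i) = i^2

m = 4                              # number of independent cases
results = pmap(solve_case, 1:m)    # each case goes to whichever worker is free

rmprocs(workers())                 # shut the workers down again
```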

Any suggestions would be appreciated. As indicated, my cluster uses Slurm for scheduling.

If I am not wrong, you want the --exclusive flag.

I think I know how to get the resources I want from the cluster manager, in terms of getting m nodes and all n cores on each of those nodes. What I’m more concerned with is how, presumably within Julia, I get it to assign one piece of the job to node 1, using all n cores on node 1, another piece of the job to node 2, using all n cores on node 2, etc.
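One pattern that might fit, offered as a sketch rather than a confirmed answer: start a single Julia worker per node and let each worker use threads for the n cores, so that pmap hands out one case per node and the case itself is multithreaded. The code below uses two local workers with 2 threads each so it runs without a cluster. On Slurm, the assumption is a job submitted with --nodes=m, --ntasks-per-node=1 and --cpus-per-task=n, with the workers started through `addprocs(SlurmManager(); exeflags=...)`. Whether SlurmManager forwards `exeflags` to the workers in your installed version is worth checking against the package README. `solve_case` and its inner loop are hypothetical.

```julia
using Distributed

# Local stand-in: one worker per "node", each started with 2 threads.
# Assumed Slurm equivalent (not verified here):
#   using SlurmClusterManager
#   n = ENV["SLURM_CPUS_PER_TASK"]
#   addprocs(SlurmManager(); exeflags=`--threads=$n`)
addprocs(2; exeflags=`--threads=2`)

# Hypothetical case: the inner loop is split across the worker's threads.
@everywhere function solve_case(i)
    parts = zeros(Int, 8)
    Threads.@threads for k in 1:8
        parts[k] = i * k           # each iteration writes its own slot
    end
    return sum(parts)
end

m = 4
results = pmap(solve_case, 1:m)    # one case at a time per worker (per node)

# Number of threads available on each worker
threads_per_worker = [remotecall_fetch(Threads.nthreads, w) for w in workers()]

rmprocs(workers())
```

With this layout, the process-level parallelism (pmap) is across nodes and the thread-level parallelism is within a node; code that is not thread-safe would need a different split, for example nested worker pools.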