Running Julia in a SLURM Cluster

My preferred way is something like this:

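#!/usr/bin/env sh
#SBATCH -N 10
#SBATCH --ntasks-per-node=8
#SBATCH -o %x-%j.out
#=
# Shell part: Julia treats this whole region as a block comment.
# Look up the path of the submitted script and re-launch it under Julia,
# one process per SLURM task.
srun julia $(scontrol show job $SLURM_JOBID | awk -F= '/Command=/{print $2}')
exit
# =#

using Distributed, MPIClusterManagers
manager = MPIClusterManagers.start_main_loop(MPI_TRANSPORT_ALL)

# 10 nodes x 8 tasks per node = 80 MPI ranks (set by the #SBATCH lines above);
# rank 0 is the master and the other ranks are the workers
println(workers())

MPIClusterManagers.stop_main_loop(manager)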

Put this in myscript.jl and submit it with sbatch myscript.jl.
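Whatever comes after start_main_loop runs on the master only, and from there the workers behave like any other Distributed pool. Here is a minimal sketch of farming work out with pmap. To keep it runnable without a cluster I use addprocs(2) as a stand-in for the MPI setup, and slow_square is just a made-up placeholder workload, so treat both as assumptions rather than part of the batch script:

```julia
using Distributed

# Stand-in for the cluster setup: in the batch script the workers come from
# MPIClusterManagers.start_main_loop; here we add two local ones instead.
nprocs() == 1 && addprocs(2)

# Hypothetical workload; it must be defined on every process before use.
@everywhere slow_square(x) = (sleep(0.1); x^2)

# pmap hands the inputs out to whichever workers are free and
# returns the results in input order.
results = pmap(slow_square, 1:8)
println(results)
```

In the real script you would drop the addprocs line and keep the rest between start_main_loop and stop_main_loop.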

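The batch script relies on a neat Julia/SLURM trick and on MPIClusterManagers.jl. The trick is that the file is valid as both a shell script and a Julia script: sbatch runs it with sh, which sees the #SBATCH, #= and # =# lines as ordinary comments and executes only the srun and exit lines, while srun then starts Julia on the same file, and Julia skips everything between #= and =# as a block comment.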

ClusterManagers.jl's ElasticManager is also quite useful for dynamically hooking up workers to e.g. a Jupyter session, if you prefer an interactive workflow.
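For the interactive route, a rough sketch of what that looks like, assuming a ClusterManagers.jl version that still ships ElasticManager and a network where the compute nodes can reach the machine running the session. The cookie, address and port in the comments are placeholders that you would copy from your own manager, not real values:

```julia
using Distributed, ClusterManagers

# Run this in the interactive session (e.g. a Jupyter notebook).
# addr=:auto picks the host's address on the local network and port=0 lets
# the OS choose a free port; the manager then listens for incoming workers.
em = ElasticManager(addr=:auto, port=0)

# Each worker is a separate Julia process, typically launched through srun
# or a small sbatch job, that connects back by calling:
#
#   using ClusterManagers
#   ClusterManagers.elastic_worker("<cookie>", "<addr>", <port>)
#
# Once workers have connected they show up in the session like any others.
println(workers())
```

Workers can join or leave while the session stays up, which is what makes this convenient for notebooks.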
