I spent last week fighting with issues on a Slurm cluster and, having finally figured them out, wanted to share the result (and a warning):
I had been parallelizing through a slurm batch script with this call:
julia --machinefile $SLURM_NODEFILE indiv_array.jl
(For full script, see here)
It turns out (as the Vanderbilt IT team and I discovered) that the problem with this strategy is that julia opens new processes using ssh, so they escape SLURM’s notice. As a result, my parallel workers were running outside of SLURM’s awareness (the technical term is, I believe, outside the cgroup), taking up memory unexpectedly and not always shutting down when scancel was called on the main task (at one point I apparently had >30 zombie processes running on the research cluster, even though my squeue was clean).
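One way to spot workers that have escaped like this is to look at each process's cgroup membership. The following is only a minimal sketch, not something taken from the troubleshooting described above: it assumes a Linux node with /proc, and that the site's Slurm cgroup setup puts "slurm" somewhere in the cgroup path (the naming varies by configuration). The function names are made up for illustration.

```shell
#!/bin/bash
# Succeeds if the given cgroup file mentions Slurm.
# ASSUMPTION: the cluster uses Slurm's cgroup plugin and its cgroup paths
# contain "slurm"; check a known-good job on your own cluster first.
in_slurm_cgroup() {
    local cgroup_file="$1"   # e.g. /proc/<pid>/cgroup
    grep -qi 'slurm' "$cgroup_file" 2>/dev/null
}

# Lists the current user's julia processes on this node and flags the
# ones that do not appear to be inside a Slurm cgroup.
report_julia_workers() {
    local pid
    for pid in $(pgrep -u "$(id -un)" julia); do
        if in_slurm_cgroup "/proc/$pid/cgroup"; then
            echo "$pid inside"
        else
            echo "$pid OUTSIDE"
        fi
    done
}
```

Sourcing this on a compute node and calling report_julia_workers should list any julia process that scancel is unlikely to clean up.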
ClusterManagers.jl solves this, but doesn’t seem to work well for busy Slurm clusters (which generally require an sbatch script and long waits for resources), since it uses
CC: @ChrisRackauckas @raminammour
srun should be fine since that is the correct way of starting a job.
The workflow should work something like this:
salloc | sbatch # create resources.
julia> addprocs(SlurmManager(2)) # SlurmManager should inherit the outside allocation.
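
As a rough sketch of that workflow in batch form (not a tested recipe): the script below would be submitted with sbatch, and Julia then adds its workers through SlurmManager from inside the allocation. It assumes Julia 1.x with ClusterManagers.jl installed; the resource values, the ntasks variable, and the fallback of 2 are illustrative.

```shell
#!/bin/bash
#SBATCH --ntasks=2          # resources for the Julia workers
#SBATCH --time=00:10:00     # illustrative time limit

# Slurm sets SLURM_NTASKS inside the allocation when --ntasks is given;
# the fallback of 2 only exists so the script can be dry-run elsewhere.
ntasks="${SLURM_NTASKS:-2}"

# Julia code passed to `julia -e`.
# ASSUMPTION: Julia 1.x with ClusterManagers.jl available.
julia_code='
using Distributed, ClusterManagers
addprocs(SlurmManager(parse(Int, ARGS[1])))  # workers start via srun, inside this allocation
@everywhere println("process ", myid(), " on ", gethostname())
'

if command -v julia >/dev/null 2>&1 && command -v srun >/dev/null 2>&1; then
    julia -e "$julia_code" "$ntasks"
else
    echo "julia or srun not found; submit this with sbatch on a Slurm cluster" >&2
fi
```

Because the workers are started by srun rather than ssh, they should stay visible to Slurm and be cleaned up by scancel.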
OH! So an srun executed inside a Slurm allocation doesn’t try to create a new allocation; it starts processes in that existing allocation?
Yes. See https://slurm.schedmd.com/srun.html
Run a parallel job on cluster managed by Slurm. If necessary, srun will first create a resource allocation in which to run the parallel job.
srun in general is the right way of starting jobs within an allocation and crucially within
When the job allocation is finally granted for the batch script, Slurm runs a single copy of the batch script on the first node in the set of allocated nodes.
That’s why an sbatch script usually has one or several srun commands in it.
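
A minimal sketch of such a batch script, to make the split concrete: the resource values are illustrative, and the check for srun is only there so the script can be dry-run on a machine without Slurm.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=2
#SBATCH --time=00:05:00

# Commands outside srun run once, on the first node of the allocation.
first_node="$(uname -n)"
echo "batch script running on ${first_node}"

# srun starts one copy of the command per task, inside the allocation
# that sbatch already obtained.
if command -v srun >/dev/null 2>&1; then
    srun uname -n
else
    echo "srun not found; submit this with sbatch on a Slurm cluster" >&2
fi
```

Submitted with sbatch, this should print the first node's name once from the script itself and then one node name per task from srun.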
This conversation is revelatory. Thank you!!
Huh, I didn’t know that would work either. Awesome @vchuravy. One note: does default addprocs() do the correct thing like addprocs(SlurmManager(2)) when in a cluster job? What I mean is, does addprocs() automatically recognize that it should use the SlurmManager with 2 processes when it’s called from a SLURM job with 2 cores, or is that asking too much?
That is asking too much. We would have to redefine addprocs when loading ClusterManagers.