I have been using Julia on a Slurm HPC cluster without problems. I am now trying to use the MPI.jl package to parallelize my code. The cluster provides Intel MPI, which I have been trying to use without success. I added the MPI.jl package using Pkg.add("MPI"). My job submission script looks as follows:
#SBATCH --partition short
#SBATCH -N 1
...
module load intel/MPI
mpirun -n 100 julia script.jl
However, this does not work and I receive an error message saying:
Juliaup configuration is locked by another process, waiting for it to unlock.
This message is repeated about 100 times in the error file “stderr”. Since I launched 100 MPI ranks, this suggests that each rank is waiting for the configuration file to be unlocked.
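To check that the message really appears about once per rank, the matching lines can be counted. This is only a minimal sketch: it assumes the error file is named "stderr" as in my job, and the function name `count_lock_messages` is made up for illustration.

```shell
#!/bin/sh
# Count how many lines of a job's stderr file contain the juliaup lock message.
# Assumption: the file is called "stderr"; pass another path as the first argument.
# The function name is hypothetical.
count_lock_messages() {
    errfile="${1:-stderr}"
    if [ ! -f "$errfile" ]; then
        # No such file: report zero matches instead of failing.
        echo 0
        return 0
    fi
    # grep -c prints the number of matching lines; it exits non-zero when
    # nothing matches, so "|| true" keeps the function from failing.
    grep -c 'Juliaup configuration is locked by another process' "$errfile" || true
}

echo "lock messages: $(count_lock_messages stderr)"
```

A count close to the number of ranks passed to `mpirun -n` would be consistent with every rank hitting the same lock.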
Now Julia itself no longer works: I keep receiving the same message. I understand that my failed attempt to use MPI left some kind of lock on a configuration file. However, I have cancelled the MPI jobs, logged out and back in, and the issue is still there. I have two questions:
- How do I properly use MPI.jl on a Slurm cluster? From what I found online, it looks like I need to add another package, “MPIPreferences.jl”, to make this work. The documentation is not very clear to me.
- How do I resolve the locked configuration file?
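Since the message says the configuration is locked by another process, one thing that can be checked is whether any Julia processes of mine survived the cancelled jobs. The following is a minimal sketch, not a fix: it assumes a POSIX `ps` is available, it only sees the node it is run on (so leftover ranks on a compute node would not show up from the login node), and the function name `list_julia_processes` is hypothetical.

```shell
#!/bin/sh
# List processes owned by the current user whose command name contains "julia"
# (this also matches names such as "juliaup"). The function name is hypothetical.
list_julia_processes() {
    if ! command -v ps >/dev/null 2>&1; then
        echo "ps not available on this system" >&2
        return 0
    fi
    # "pid=" and "comm=" suppress the header line. grep exits non-zero when
    # nothing matches, so "|| true" keeps the function from failing.
    ps -u "$(id -un)" -o pid= -o comm= | grep -i 'julia' || true
}

list_julia_processes
```

Empty output means no matching process was found on that node, which by itself does not show where the lock comes from.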