I have a Julia script that solves many linear programming problems in parallel using Gurobi and JuMP. When I test the code on an HPC cluster by submitting the following Slurm script:
#!/bin/bash
#SBATCH --job-name=xz49
#SBATCH --partition=interactive
#SBATCH --nodes=1
#SBATCH --export=ALL
#SBATCH --ntasks-per-node=10
#SBATCH --mem-per-cpu=4G
#SBATCH --time=00:30:00
julia --machine-file <(srun hostname -s) gurobi.jl
I get the following error:
ERROR: LoadError: On worker 2:
Invalid Gurobi license
error at ./error.jl:33
Env at /home/xz49/.julia/packages/Gurobi/GeYlA/src/grb_env.jl:19
top-level scope at none:0
eval at ./boot.jl:331
#101 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Distributed/src/process_messages.jl:88
#94 at ./task.jl:358
...and 9 more exception(s).
However, if I change my Slurm script to the following, it works fine:
#!/bin/bash
#SBATCH --job-name=xz49
#SBATCH --partition=interactive
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --threads-per-core=1
#SBATCH --mem-per-cpu=4G
#SBATCH --time=00:30:00
srun julia -p $SLURM_CPUS_PER_TASK gurobi.jl
Both scripts should run in parallel on 10 CPUs on a single node, but the machine-file approach does not work. How should I set up the Gurobi environment variables in the first case? Also, in the second case, I can take advantage of hyper-threading with
srun julia -p $(( 2 * $SLURM_CPUS_PER_TASK )) gurobi.jl
so that my Julia script runs in parallel on 20 physical/virtual CPUs. How can I achieve the same thing with the machine-file approach? Thank you very much.
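To make the hyper-threading part of the question concrete, here is a minimal sketch of one way the machine file could be generated. It assumes that Julia's machine file accepts entries of the form `count*host`, as described for `addprocs` in the Distributed documentation. The helper name `build_machinefile` and its multiplier argument are hypothetical, introduced only for illustration, and none of this has been tested on the cluster in question.

```shell
# Hypothetical helper (not part of Julia or Slurm).
# Reads one hostname per line on stdin, as printed by `srun hostname -s`,
# and prints one "count*host" line per distinct host.
# The optional first argument multiplies each count (default 1),
# e.g. 2 to request two workers per allocated task.
build_machinefile() {
    local factor="${1:-1}"
    sort | uniq -c | awk -v f="$factor" '{ printf "%d*%s\n", $1 * f, $2 }'
}

# Possible use inside the first job script (assumption, untested):
#   julia --machine-file <(srun hostname -s | build_machinefile 2) gurobi.jl
```

With `--ntasks-per-node=10` on one node, `srun hostname -s` prints the same hostname ten times, so a multiplier of 2 would produce a single entry requesting 20 workers on that host. Separately, workers listed in a machine file are started over SSH, so they may not see environment variables such as `GRB_LICENSE_FILE` that are set in the job's shell. That is one possible cause of the license error, not a confirmed one.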