Odd thread calling behavior on HPC

I’ve been running Julia on my university’s HPC using a self-installed version of Julia (through juliaup) and have noticed odd thread calling behavior.

When running the following Julia script

a = zeros(10)
Threads.@threads for i = 1:10
    a[i] = Threads.threadid()
end

sleep(120)

println(a)

using this bash script

#!/bin/bash

#SBATCH --job-name="Multi-threading"
#SBATCH --time=00:05:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=compute
#SBATCH --mem-per-cpu=1GB
#SBATCH --account=innovation
#SBATCH --mail-type=END     # Set mail type to 'END' to receive a mail when the job finishes. 

set -x

export SRUN_CPUS_PER_TASK="$SLURM_CPUS_PER_TASK"

srun julia --project=examples --threads $SRUN_CPUS_PER_TASK examples/multi_threading.jl > examples/multi_threading.log

I would expect Julia to use a single thread (and I can confirm that $SRUN_CPUS_PER_TASK is indeed 1). Instead, Julia seems to be using 25 threads (the 6th column is NLWP, which as far as I understand is the number of threads, i.e. lightweight processes, in the process):

09:38:20 paltmeyer@login04 julia-hpc-for-dummies ±|main|→ sbatch examples/multi_threading_blue.sh 
Submitted batch job 2732345
09:38:27 paltmeyer@login04 julia-hpc-for-dummies ±|main|→ srun --jobid=2732345 --overlap --pty bash
09:38:34 paltmeyer@cmp008 julia-hpc-for-dummies ±|main|→ ps -eLF | grep paltm
paltmey+  646015  646010  646015  0    1  3217  2904  11 09:38 ?        00:00:00 /bin/bash /cm/local/apps/slurm/var/spool/job2732345/slurm_script
paltmey+  646019  646015  646019  0    5 80711  5752  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646019  646015  646024  0    5 80711  5752  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646019  646015  646025  0    5 80711  5752  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646019  646015  646026  0    5 80711  5752  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646019  646015  646027  0    5 80711  5752  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646022  646019  646022  0    1  9579   780  11 09:38 ?        00:00:00 srun julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646043  646032  646043  0    2  1132  1528  11 09:38 ?        00:00:00 /home/paltmeyer/.juliaup/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646043  646032  646045  0    2  1132  1528  11 09:38 ?        00:00:00 /home/paltmeyer/.juliaup/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646046  3   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646054  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646056  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646057  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646058  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646059  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646060  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646061  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646062  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646063  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646064  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646065  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646066  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646067  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646068  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646069  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646070  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646071  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646072  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646073  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646074  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646075  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646076  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646077  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646046  646043  646078  0   25 885198 231384 11 09:38 ?       00:00:00 /scratch/paltmeyer/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/bin/julia --project=examples --threads 1 examples/multi_threading.jl
paltmey+  646112  646104  646112  2    1  7721  8660  11 09:38 pts/6    00:00:00 /usr/bin/bash
paltmey+  646499  646112  646499  0    1 14610  3700  11 09:38 pts/6    00:00:00 ps -eLF
paltmey+  646500  646112  646500  0    1  2954  1164  11 09:38 pts/6    00:00:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn paltm

Note that rows 2-6 of the output returned by ps -eLF | grep paltm are the 5 threads of the srun process (PID 646019). The 25 threads belong to the Julia process itself (PID 646046), which accounts for the 25 rows with NLWP 25.
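A quicker way to read off the thread count of a single process than scanning the full ps listing is to count its entries under /proc. The following is only a sketch, assuming a Linux node where /proc is mounted; count_threads is a hypothetical helper name, not something from the original script.

```shell
#!/bin/bash
# Count the threads (lightweight processes) of one process.
# Assumption: Linux with /proc mounted. count_threads is a hypothetical helper.
count_threads() {
    # /proc/<pid>/task holds one directory per thread of that process,
    # which is the same quantity ps reports in the NLWP column.
    ls "/proc/$1/task" | wc -l
}

# Example: thread count of the current shell.
count_threads "$$"
```

On the compute node, passing the PID of the Julia process (646046 in the listing above) instead of "$$" should report the same number that ps shows as NLWP.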

When I scale up the number of requested threads, e.g. cpus-per-task=3, the total number of threads goes up accordingly to 27 (similarly, I’ve tested cpus-per-task=5 |> 29, cpus-per-task=7 |> 31, …).

I’m still very new to HPC so I expect I’m just doing something silly or forgot to take care of something when self-installing Julia on the HPC.

Any help would be much appreciated!

Julia only uses one worker thread, but other threads may be started as well. As an example, OpenBLAS automatically starts its own set of threads, as does LibUV.

I think you’re right, it seems that simply adding export OPENBLAS_NUM_THREADS=1 does the trick. See also here. Thanks for the hint @vchuravy
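For anyone landing here later, this is roughly what the thread-related part of the job script looks like with that change. It is a sketch rather than the exact script: the fallback to 1 outside Slurm and the final echo are additions for illustration, and the launch line is left as a comment so the snippet can be run anywhere.

```shell
#!/bin/bash
# Sketch of the thread-related lines of the job script with the OpenBLAS cap added.

# Assumption for illustration: fall back to 1 when not running under Slurm.
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-1}"

# Limit OpenBLAS to a single thread instead of letting it pick its own default.
export OPENBLAS_NUM_THREADS=1

# The launch line from the original script stays the same:
# srun julia --project=examples --threads $SRUN_CPUS_PER_TASK examples/multi_threading.jl > examples/multi_threading.log

# Illustration only: show the settings the job would run with.
echo "Julia threads: $SRUN_CPUS_PER_TASK, OpenBLAS threads: $OPENBLAS_NUM_THREADS"
```

The export has to come before the srun line so that the Julia process inherits it. Other helper threads, such as the LibUV ones mentioned above, are not affected by this variable.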