I would like to run a multi-threaded Julia script on a Sun Grid Engine cluster. I can reserve 2 cores, load Julia, and run the script with 4 threads by submitting the following job script:
#!/bin/csh
#$ -pe smp 2
module load julia/1.7.1
julia --threads 4 xyz.jl > ...
The script runs correctly, but the calculation takes about 4 times as long to finish as it does with a single thread, so it looks like Julia isn’t actually distributing the threads over the available cores and isn’t running them in parallel. I don’t have this problem when running the script on my PC instead of on the cluster. What am I doing wrong?
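In case it helps with diagnosing this, below is a minimal sketch of a check that could be run inside the job to see what Julia itself reports. The function names `thread_report` and `threads_used` are only illustrative, and I am assuming that Grid Engine exports the granted slot count to the job as `NSLOTS`. The sketch shows how many threads Julia started and which thread ids did the work, but not whether those threads are bound to the same core.

```julia
using Base.Threads

# Report the thread count Julia started with, the logical CPUs on the host,
# and the slot count granted by the scheduler (assumed to be in NSLOTS).
function thread_report()
    nslots = get(ENV, "NSLOTS", "unset")  # "unset" when run outside a job
    println("Julia threads:       ", nthreads())
    println("Logical CPUs (host): ", Sys.CPU_THREADS)
    println("NSLOTS:              ", nslots)
    return (threads = nthreads(), cpus = Sys.CPU_THREADS, nslots = nslots)
end

# Record which thread ran each iteration of a threaded loop,
# then return the distinct thread ids that were used.
function threads_used(n::Integer = 4 * nthreads())
    ids = zeros(Int, n)
    @threads for i in 1:n
        ids[i] = threadid()
    end
    return sort(unique(ids))
end

report = thread_report()
println("Thread ids used:     ", threads_used())
```

If the reported thread count is 4 but `NSLOTS` is 2, the job is running more threads than the cores it reserved, which may be part of the slowdown.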