Sorry if this has already been answered somewhere, but my quick search didn’t really give me an answer.
I was wondering whether there is any possible performance regression when setting
JULIA_NUM_THREADS larger than the number of actually available threads. On my laptop there’s no issue: Julia automatically caps it at the maximum number of available threads, even if I set the environment variable too high. On the cluster I’m using, I allocate some threads and set the number accordingly. The issue is that the number of allocated threads is not the same for Intel and AMD nodes, i.e. when allocating 16 “CPUs” through the LSF manager I get 16 threads on AMD and 32 on Intel. Should I just always set
JULIA_NUM_THREADS to 32, and it doesn’t matter for the AMD cores, or are there any caveats? I don’t know prior to the run on which node the job will land.
The number of threads in Julia will be capped at the actual number of hardware threads:
% JULIA_NUM_THREADS=100 julia -e 'using Base.Threads; @show Threads.nthreads()'
Threads.nthreads() = 8
Note that you may experience some performance degradation if you use hyperthreading, but that depends on your actual workload.
I think this was changed in 1.6.
Okay, I haven’t checked this, but how does Julia know how many threads I have allocated on a node where more threads are available to others?
I’m not sure I understand the question: don’t you launch separate processes on each node?
Yes, I do. The question is the following: should I set my
JULIA_NUM_THREADS to the number of possible hyperthreads on an Intel processor, or should I try to get the right number for the AMD processors? Would there be a noticeable difference with 2x too many threads? Maybe this makes more sense if I also point out that this number of threads is not all the threads available on one node, just the ones I am able to use.
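For what it’s worth, one way to avoid hard-coding 16 vs 32 would be to compute the number at job start instead of in advance. This is only a sketch, assuming the scheduler restricts the job to its allocated CPUs (so that `nproc` reports the usable count rather than the whole node); the script itself is hypothetical:

```shell
#!/bin/sh
# Hypothetical LSF job script snippet: derive JULIA_NUM_THREADS from the
# CPUs actually visible to this job instead of hard-coding a number.
# nproc reports available processing units, honoring any CPU affinity
# the scheduler has set for the job.
NTHREADS=$(nproc)
export JULIA_NUM_THREADS="$NTHREADS"
echo "JULIA_NUM_THREADS=$JULIA_NUM_THREADS"
```

That way the same submission script does the right thing on both the Intel and the AMD nodes, whatever the node’s hyperthreading setup is.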