I’m trying to get my head around distributed programming to execute a script on a cluster.
My code works well on my workstation with one CPU. It internally uses @threads to parallelize some stuff within the main task, using the threads of my one CPU. So I start as follows:
$ julia -t 12
On the cluster, I would like to launch several of these tasks (each will use
Now I’m reading the documentation of distributed programming and I read:
julia -p n provides n worker processes on the local machine. Generally it makes sense for n to equal the number of CPU threads (logical cores) on the machine. Note that the -p argument implicitly loads module
Say I have 4 CPUs on my cluster node. Should I start with
1) $ julia -p 4 -t 12 (assuming 12 cores per CPU) or
2) $ julia -p 48? The latter takes the above quote literally.
I tried 2) on my local machine and I see that nthreads() == 1, whereas 1) gives nprocs() == 4 && nthreads() == 12.
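Calling nthreads() at the prompt only reports the process it runs on (the master), so it does not say how many threads each worker has. A minimal sketch of how the per-process counts could be checked, assuming the session was started with one of the commands above; thread_counts is a hypothetical helper name, not something provided by Distributed:

```julia
using Distributed

# Hypothetical helper: ask every process (master and workers)
# how many threads it was started with.
function thread_counts()
    return [pid => remotecall_fetch(() -> Threads.nthreads(), pid) for pid in procs()]
end

counts = thread_counts()
for (pid, n) in counts
    println("process ", pid, ": ", n, " thread(s)")
end
```

Run in a session without workers, this lists only process 1; with -p it lists one line per worker as well.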
Am I correct to think that with 1) I can start 4 parallel “main” tasks that will each use 12 threads for multithreading, or will each get only 3 because the 12 threads are divided between the workers? In that case, should I use
$ julia -p 4 -t 48 instead?
Am I correct to think that with 2) I will launch 48 “main” tasks that will not be able to multithread my @threads calls because nthreads() == 1 then?
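One way to make the per-worker thread count explicit, rather than relying on how -p and -t combine, might be to add the workers from inside Julia. This is only a sketch under assumptions: the sizes below are small placeholders (the node described here would use 4 workers and 12 threads), and whether this fits a given cluster scheduler is not something established in this post. addprocs and its exeflags keyword come from Distributed; the variable names are made up for illustration.

```julia
using Distributed

# Placeholder sizes for illustration; the node in question would use 4 and 12.
n_workers = 2
threads_per_worker = 2

# exeflags passes extra command-line flags to every worker process,
# so each worker is started with its own --threads setting.
pids = addprocs(n_workers; exeflags=`--threads=$threads_per_worker`)

# Ask each worker how many threads it actually has.
worker_threads = [remotecall_fetch(() -> Threads.nthreads(), pid) for pid in pids]
for (pid, n) in zip(pids, worker_threads)
    println("worker ", pid, " has ", n, " thread(s)")
end

# Shut the workers down again.
rmprocs(pids)
```

With this layout, each “main” task would run on one worker and its @threads loops would use that worker's own threads.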