Hello,
I’m trying to get my head around distributed programming to execute a script on a cluster.
My code works well on my workstation with one CPU. It internally uses @threads to parallelize some work within the main task, using the threads of my one CPU. So I start Julia as follows: $ julia -t 12.
On the cluster, I would like to launch several of these tasks (each will use @threads occasionally).
Now I’m reading the documentation of distributed programming and I read
Starting with julia -p n provides n worker processes on the local machine. Generally it makes sense for n to equal the number of CPU threads (logical cores) on the machine. Note that the -p argument implicitly loads module Distributed.
Say I have 4 CPUs on my cluster node. Should I start with
1) $ julia -p 4 -t 12 (assuming 12 cores per CPU), or
2) $ julia -p 48?
The latter takes the above quote literally.
I tried 2) on my local machine and I see that nthreads() == 1. With 1) I get nprocs() == 4 && nthreads() == 12.
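Note that nthreads() evaluated at the prompt only describes the process it is typed into, so it may not reflect what the workers have. A minimal sketch for querying every process, meant to be run in a session started with either command; it relies only on Distributed and Base.Threads, and the printed format is just for illustration:

```julia
using Distributed

# nprocs() counts the master process as well; nworkers() counts only workers.
println("nprocs = ", nprocs(), ", nworkers = ", nworkers())

# Ask each process (master included) for its own thread count.
for pid in procs()
    n = remotecall_fetch(Threads.nthreads, pid)
    println("process ", pid, ": ", n, " thread(s)")
end
```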
Am I correct to think that with 1) I can start 4 parallel “main” tasks that will each use 12 threads for multithreading, or will each get only 3 threads because the 12 threads are divided among the workers? In that case I should use $ julia -p 4 -t 48 instead.
and
Am I correct to think that with 2) I will launch 48 “main” tasks that cannot multithread my @threads calls, since nthreads() == 1 in that case?
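To make the intended setup concrete, here is a rough sketch of what I am aiming for. It does not settle how -p and -t combine. Instead it assumes the workers are added from inside the session with addprocs, passing the thread count explicitly through its exeflags keyword. The numbers 4 and 12 come from the example above, and main_task is a hypothetical stand-in for my real code:

```julia
using Distributed

# Assumed numbers from the question: 4 workers, each started with 12 threads.
addprocs(4; exeflags=`--threads=12`)

# Hypothetical stand-in for the real "main" task: a threaded loop that
# records which thread id handled each iteration.
@everywhere function main_task(n)
    ids = zeros(Int, n)
    Threads.@threads for i in 1:n
        ids[i] = Threads.threadid()
    end
    return length(unique(ids))  # count of distinct thread ids observed
end

# Run one main task per worker and collect the results on the master.
used = pmap(_ -> main_task(1_000), 1:nworkers())
println(used)
```

If each entry of used is greater than 1, the @threads loops inside the workers are actually running on several threads.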