I have a question regarding the setup of hybrid parallelism which combines multithreading and distributed computations.
Let’s say I want to run some computation on a machine with Non-Uniform Memory Access (NUMA). In this case, it is a good idea to have one Julia process per NUMA node, with each process running multiple threads on that same node.
I know that addprocs allows one to specify the CPU affinity of the additional Julia processes: this way I can pin the processes to different NUMA nodes. Besides that, one can pass a special flag to addprocs to enable multithreading for these processes. My question is: how does one specify the affinity of these threads, so that they are pinned to the same NUMA node as the process they belong to?