Running multiple instances of an already parallel external program in parallel

I have a situation like the one below:


nrep = 3
Threads.@threads for i in 1:nrep
    # each iteration launches one instance of the external program
    run(`bash -c """cd $path && ./external_program"""`)
end


where external_program is itself parallelized with OpenMP, along these lines:

#pragma omp parallel for num_threads(10)
for (int i = 0; i < 10; i++){
     // do stuff
}

Also, my CPU has 32 cores (64 threads). Julia is running with 32 threads.

Everything works fine, but I get far more overhead when nrep*num_threads > 32 than when running nested parallel loops entirely from Julia.

I’m not sure what exactly is going on in the background, but I’m guessing the latter case is composable while my problem is not.

Is that understanding correct? Is there a way to address this?

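Yes, Julia has composable multithreading: nested parallel loops written in Julia all share one scheduler and one thread pool, whereas each external process brings its own OpenMP threads that Julia's scheduler knows nothing about.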
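
As a rough illustration of what composable means here, this is a minimal sketch of nested task spawning. The function name `nested_sum` and the toy workload `i * j` are made up for the example and are not from the original post. However many tasks get spawned, they are all scheduled onto the same `Threads.nthreads()` threads:

```julia
# Hypothetical toy example: nested parallelism using Julia tasks only.
function nested_sum(nouter, ninner)
    outer = map(1:nouter) do i
        Threads.@spawn begin
            # Inner tasks are queued on the same scheduler as the outer ones,
            # so no extra OS threads are created for them.
            inner = [Threads.@spawn(i * j) for j in 1:ninner]
            sum(fetch, inner)
        end
    end
    return sum(fetch, outer)
end

nested_sum(3, 10)  # 30 inner tasks, but never more than Threads.nthreads() running at once
```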

To fix this, you either need to control how many threads each external program uses and how many instances you start at once, so that you avoid oversubscription, or just use Julia and don’t waste brain cycles on the thread logistics :wink:
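
If you go the first route, one possible sketch is below. It assumes the external program honors the `OMP_NUM_THREADS` environment variable, which means dropping the hard-coded `num_threads(10)` clause, since that clause takes precedence over the variable. The helper names `threads_per_run`, `external_cmd` and `run_all` are hypothetical, and `ncores` defaults to half of `Sys.CPU_THREADS` on the assumption of 2 hardware threads per core:

```julia
# Hypothetical helper: split the available cores evenly between concurrent runs.
threads_per_run(ncores::Integer, nconcurrent::Integer) = max(1, ncores ÷ nconcurrent)

# Hypothetical helper: command for one run, started in `dir`, with an OpenMP thread cap.
function external_cmd(dir::AbstractString, nthreads::Integer; exe = "./external_program")
    cmd = Cmd(`$exe`; dir = dir)
    # addenv keeps the inherited environment and adds/overrides OMP_NUM_THREADS.
    return addenv(cmd, "OMP_NUM_THREADS" => string(nthreads))
end

# Start `nrep` instances at once, each limited to its share of the cores.
function run_all(dir, nrep; ncores = Sys.CPU_THREADS ÷ 2, exe = "./external_program")
    nt = threads_per_run(ncores, nrep)
    @sync for _ in 1:nrep
        # `run` only waits on the child process, so a task per instance is enough.
        Threads.@spawn run(external_cmd(dir, nt; exe = exe))
    end
    return nt
end

# Usage (not executed here): run_all(path, 3) on a 32-core machine gives each run 10 threads.
```

With nrep = 3 on 32 cores this gives 3 × 10 = 30 OpenMP threads in total, which stays under the core count. If nrep exceeds the number of cores, you would also need to cap how many instances run at the same time.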
