My question is probably easy to solve, but I got confused by the various packages and approaches (@sync, @async, Distributed, etc.). In the simplest case, I have a Julia script with a loop that runs a bash script, which in turn runs a compiled program (several times, serially).
for param in [a1,a2,a3,......,a100]
    basedir = Base.Filesystem.pwd()
    run(`bash`) # this launches several programs serially
end
I would like to launch these bash scripts in parallel so that k cores are busy at all times, but never more than k. That is, k processes start initially, and whenever one finishes and frees a core, the next bash script starts, until the loop over param is exhausted.
If you are on Linux, you could also call GNU Parallel from Julia, which does exactly that: it takes a text file of commands (I use one command per line; I don't know whether it can be used in other ways), runs the first k commands (one per core), and then starts one of the remaining commands as soon as a running one finishes.
For example in my model I have a “” file that contains:
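As a rough sketch of how this approach might look from the Julia side: the helper `gnu_parallel_cmd`, the file name `commands.txt`, and the script name `my_script.sh` are all hypothetical placeholders, not the poster's actual setup. It assumes GNU Parallel is installed and relies on its documented `--jobs` and `--arg-file` options; with no command template given, GNU Parallel executes each input line as a command.

```julia
# Hypothetical helper: builds (but does not run) a GNU Parallel invocation.
# Assumes `parallel` is on PATH when the command is eventually run.
function gnu_parallel_cmd(command_file::AbstractString; jobs::Integer = Sys.CPU_THREADS)
    # --jobs caps the number of simultaneous processes;
    # --arg-file names the file whose lines are the commands to execute.
    return `parallel --jobs $jobs --arg-file $command_file`
end

# Write one shell command per line. Script and file names are placeholders.
command_file = joinpath(mktempdir(), "commands.txt")
open(command_file, "w") do io
    for p in 1:5
        println(io, "bash my_script.sh ", p)
    end
end

cmd = gnu_parallel_cmd(command_file; jobs = 2)
# run(cmd)  # uncomment on a machine where GNU Parallel is installed
```

Building the `Cmd` separately from running it makes it easy to inspect the exact invocation before launching anything.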
If you want to limit the number of processes run concurrently to k, you can simply use k tasks.
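function parallel_run(commands; ntasks = Sys.CPU_THREADS)
    # Producer: feeds the commands into an unbuffered channel.
    request = Channel{Cmd}() do request
        for cmd in commands
            put!(request, cmd)
        end
    end
    # Consumers: ntasks tasks, each running one command at a time.
    @sync for _ in 1:ntasks
        @async try
            foreach(run, request)
        finally
            close(request) # shutdown on error
        end
    end
end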
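A shorter variant of the same idea, for comparison: this sketch swaps the hand-written channel for `Base.asyncmap`, whose `ntasks` keyword caps the number of concurrent tasks. It is an alternative to the function above, not a rewrite of it, and the `echo` commands are placeholders that assume an `echo` executable is available on PATH.

```julia
# Placeholder commands; in practice these would be the bash script invocations.
commands = [`echo job $i` for i in 1:6]

# At most 2 commands run at the same time. `success` runs the command,
# waits for it to finish, and returns true if it exited with status 0.
results = asyncmap(commands; ntasks = 2) do cmd
    success(cmd)
end
```

Because waiting on an external process yields to the Julia scheduler, the tasks overlap even though they all run on a single thread. Results come back in the same order as `commands`.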
Also, as baggepinnen mentioned, don't use cd, since it mutates global state. You can use setenv(cmd; dir = ...) to set the working directory for each command:
parallel_run([
    setenv(`pwd`; dir = ".."),
    setenv(`pwd`; dir = "/tmp"),
])
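To see why this avoids the global-state problem, here is a small check, assuming a Unix-like system with a `pwd` executable. The temporary directory is only for illustration. The command runs in its own directory while the working directory of the Julia process stays untouched.

```julia
before = pwd()           # working directory of the Julia process
target = mktempdir()     # illustrative directory for the child process

# Attach the directory to the command instead of calling cd().
cmd = setenv(`pwd`; dir = target)
reported = readchomp(cmd)   # directory as reported by the child process

after = pwd()            # unchanged, since only the child ran in `target`
```

With `cd`, concurrent tasks would race on a single process-wide directory. With `dir` set per command, each process gets its own.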
All of these comments are solutions! I was not aware of the problem with global state; this explains why my initial attempt was not working. For now, @sylvaticus's solution should work for my setup directly. And as soon as I translate some things to Julia, the other solutions are going to be great. Thank you also for the link.