I have a function which takes about 10 minutes to run. Most of that time is spent calling two external commands. I was previously using a bash script to run it. I need to run this function on several different inputs and would like to run them in parallel. Using map gives the expected results, but when I use pmap, it seems to be trying to parallelize the function itself, which I don’t want.
Further details:
The function definition looks like this:
@everywhere function getresult(x)
prefix = "si_$(x)"
# string to control pw
pw_in = """
prefix='$prefix'
"""
# string to control ph
ph_in = """
prefix='$prefix'
"""
# run functions
pw=open(`pw.x`,"r+") # create pw process
print(pw,pw_in) # send it the control info
close(pw.in)
scf_out=read(pw,String) # waits until pw is done -- 3 minutes
f_scf=open("./tmp/scfout_$prefix","w")
write(f_scf,scf_out) # write pw.x results to file for later
close(f_scf)
ph=open(`ph.x`,"r+") # create ph process
print(ph,ph_in) # send it the control info
# this process reads the results from pw based on prefix
close(ph.in)
phGout=read(ph,String) # waits until ph is done -- 7 minutes
f_scf=open("./tmp/phGout_$prefix","w")
write(f_scf,phGout) # write ph.x results to file for later
close(f_scf)
return(prefix)
end
The results are all written to files which are labeled by prefix, which I define using the input variables. The process ph.x depends on the results that pw.x wrote, and knows where to look for them from its input string.
I try to run this function on multiple inputs using:
result = pmap(x->getresult(x), [1,2,3,4])
Julia has no problem, but the external processes fail because ph.x tries to run before pw.x has finished.
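One way to check whether the second command really starts before the first has exited is to wait on each process explicitly and look at its exit code. This is only a diagnostic sketch, not a confirmed fix: `run_with_input` is a hypothetical helper that is not part of my function, and `cat` is used as a stand-in for pw.x and ph.x so the snippet runs without them.

```julia
# Hypothetical helper (not part of the original function): send `input` to
# `cmd` on stdin, collect its stdout, and block until the process has exited.
function run_with_input(cmd::Cmd, input::AbstractString)
    p = open(cmd, "r+")        # spawn the process with stdin and stdout pipes
    print(p, input)            # send the control string
    close(p.in)                # signal end of input
    output = read(p, String)   # read stdout until the process closes it
    wait(p)                    # block until the process has actually exited
    return output, p.exitcode
end

# `cat` is a stand-in for pw.x / ph.x; it simply echoes its input back.
out, code = run_with_input(`cat`, "prefix='si_1'\n")
println("exit code: ", code)
```

If the first command returns a nonzero exit code under pmap but not under map, that would suggest it is failing early rather than being overtaken by the second command.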
Some things I’ve tried:
- replacing pmap with map gives the expected result.
- running the function with different inputs on separate Julia processes at the same time gives the expected results, so it does not appear that the external processes are interfering with each other.
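To see how pmap hands out the calls, a small sketch like the following may help. It assumes a hypothetical stand-in, `whoran`, in place of getresult: it does no real work and only reports which Julia process handled each input. Each call runs entirely on one process. With no workers added (via addprocs or `julia -p N`), every call runs on process 1.

```julia
using Distributed

# `whoran` is a hypothetical stand-in for getresult: it does no real work and
# only reports which Julia process handled each input.
whoran = x -> (x, myid())

result = pmap(whoran, [1, 2, 3, 4])

for (x, pid) in result
    println("input $x ran on process $pid")
end
```

pmap returns results in the same order as the inputs, regardless of which process handled each one.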
Could you all help me figure out how to fix this? I am unfamiliar with parallel processing lingo, which is making it hard to understand the docs. It feels like Julia is trying to parallelize more than it should.