Heya,
Here’s a code file, ex1.jl, that does what it’s supposed to when I run julia ex1.jl:
module MWE

export gen_complicated_thing, do_para

function complicated_helper()
    return Dict()
end

function gen_complicated_thing(i)
    d = complicated_helper()
    d[i] = i
    return d
end

function do_para(n, numprocs) # numprocs doesn't do anything in this example
    arr = []
    for i in 1:n
        g = gen_complicated_thing(i)
        push!(arr, g)
    end
    return arr
end

end

using .MWE
println(gen_complicated_thing(19283)) # just checking gen_complicated_thing works
println(do_para(10, 2))
However, I want to parallelize the do_para function (none of the iterations depend on each other, so doing so is “trivial”). If I didn’t need to set numprocs inside the function and didn’t want a module, I could have a file ex2.jl:
using Distributed
addprocs(2)

@everywhere function complicated_helper()
    return Dict()
end

@everywhere function gen_complicated_thing(i)
    d = complicated_helper()
    d[i] = i
    return d
end

function do_para(n, numprocs) # numprocs doesn't do anything in this example
    arr = pmap(gen_complicated_thing, 1:n) # use n rather than a hard-coded 10
    return arr
end

println(gen_complicated_thing(19283)) # just checking gen_complicated_thing works
println(do_para(10, 2))
But this is not what I want.
So my question is: how do I set up my module to use addprocs and pmap inside do_para? I have to call addprocs before @everywhere, right? Do I need to reload a part of the module inside do_para so I can @everywhere what I need, or something? Or is there another approach using something other than pmap?
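To make the question concrete, here is a minimal sketch of one arrangement that might fit. It is not a confirmed answer, and it rests on some assumptions: the workers are added once at the top of the script (not inside do_para), the whole module is defined on every process with @everywhere, and numprocs only selects how many of the existing workers pmap may use through a WorkerPool. The variable names k and pool are made up for this illustration.

```julia
using Distributed

# Assumption: workers are started once, up front, before any @everywhere.
addprocs(2)

# Define the module on the master and on every worker.
@everywhere module MWE

using Distributed
export gen_complicated_thing, do_para

complicated_helper() = Dict()

function gen_complicated_thing(i)
    d = complicated_helper()
    d[i] = i
    return d
end

function do_para(n, numprocs)
    # Use at most numprocs of the workers that already exist.
    k = clamp(numprocs, 1, nworkers())
    pool = WorkerPool(workers()[1:k])
    return pmap(gen_complicated_thing, pool, 1:n)
end

end

using .MWE
result = do_para(10, 2)
println(result)
```

In this sketch numprocs is only an upper bound on the workers used. Calling addprocs inside do_para would additionally require getting the module's code onto the new workers before pmap runs, which is the part the question is about.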
Edit:
I could use Threads.@threads and julia -t 2 ex1.jl or something similar, but I would really like to specify the number of threads/processes inside the function.
function do_para(n, numprocs) # numprocs doesn't do anything in this example
    arr = Array{Any}(undef, n)
    Threads.@threads for i in 1:n
        g = gen_complicated_thing(i)
        arr[i] = g
    end
    return arr
end
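As a possible variation on the threaded version, the sketch below caps the number of concurrent tasks from inside the function by splitting 1:n into at most numtasks chunks and spawning one task per chunk. This is an illustration under assumptions, not an established solution: the parameter name numtasks is made up here, gen_complicated_thing is a stand-in for the one in ex1.jl, and the real parallelism is still limited by however many threads Julia was started with (for example via julia -t 2), which as far as I know cannot be raised from inside the function.

```julia
# Stand-in for gen_complicated_thing from ex1.jl.
gen_complicated_thing(i) = Dict{Any,Any}(i => i)

function do_para(n, numtasks)
    arr = Vector{Any}(undef, n)
    n == 0 && return arr
    # Chunk size chosen so that there are at most numtasks chunks.
    chunksize = cld(n, clamp(numtasks, 1, n))
    @sync for chunk in Iterators.partition(1:n, chunksize)
        # One task per chunk; each task writes to its own indices only.
        Threads.@spawn for i in chunk
            arr[i] = gen_complicated_thing(i)
        end
    end
    return arr
end

result = do_para(10, 2)
println(result)
```

Because every index belongs to exactly one chunk, no two tasks write to the same slot of arr, so no locking is needed in this sketch.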