Hi @evan-wehi, I’ve been playing around with this a bit, but I have one question before I write up what I’ve found: when you write `@everywhere levels = rand(np, N)`, do you actually intend each worker to sample its own instance of `levels`? That is what `@everywhere` accomplishes, and it differs from your multithreaded case, where all threads share the same `levels`.

That said, these independent instances of `levels` are never actually used in the rest of your code: they are overwritten by the instance of `levels` captured in the `doit` closure, which is the one sampled on the main process. So in the end, your multiprocessing and multithreaded versions accomplish the same thing; the multiprocessing version just does a lot of unnecessary sampling work up front, which you could eliminate by removing all occurrences of `@everywhere` inside `main_procs`.
So, to clarify: should each parallel thread/process share the same `levels`, or should each use its own independently sampled instance? (I’ve sketched both variants after the MWE below.)
Here's an MWE illustrating what I'm talking about above:
```julia
julia> using Distributed

julia> addprocs(3);  # three workers, matching the output below

julia> function f()
           @everywhere x = rand()  # Samples x independently on each worker (and on the main process)
           printlocal(_) = println(@eval(x))  # Prints the worker's own instance of x
           printmain(_) = println(x)  # Copies x over from the main process and prints it
           pmap(printlocal, 1:3)
           println()
           pmap(printmain, 1:3)
           println()
           pmap(printlocal, 1:3)  # Note how all workers now have the same x due to printmain
           return nothing
       end;

julia> f()
      From worker 3:    0.5299608462035816
      From worker 4:    0.3448114078538431
      From worker 2:    0.7929854408997841
      From worker 3:    0.017534970889111157
      From worker 2:    0.017534970889111157
      From worker 4:    0.017534970889111157
      From worker 4:    0.017534970889111157
      From worker 2:    0.017534970889111157
      From worker 3:    0.017534970889111157
```
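Depending on your answer, here's a rough sketch of how I'd write each variant. The names `levels`, `np`, `N`, and `doit` echo your post, but the array sizes, the `*_local` names, and the function bodies are placeholders I made up; the only point is where the sampling happens and how `levels` reaches the workers.

```julia
using Distributed
addprocs(3)

np, N = 4, 100  # placeholder sizes, just for the sketch

# Variant A: all tasks share one levels (same as your threaded version).
# Sample once on the main process; the closure references it, and pmap ships
# that single instance to every worker, so no @everywhere is needed.
levels = rand(np, N)
doit = i -> sum(levels) + i  # placeholder for the real work
shared = pmap(doit, 1:8)

# Variant B: each worker samples its own levels.
# Sample on every worker with @everywhere ($ interpolates np and N from the
# main process), and define the worker function with @everywhere as well, so
# that levels_local inside it resolves to the worker-local global rather than
# being copied over from the main process.
@everywhere levels_local = rand($np, $N)
@everywhere doit_local(i) = sum(levels_local) + i
independent = pmap(doit_local, 1:8)
```

In variant A, `levels` gets serialized to the workers as part of the closure, so every process works on the same draw; in variant B, nothing large needs to be shipped, but the workers no longer see the same random numbers. Which one is right depends on what your algorithm actually needs.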