I am trying to run a simulation in parallel, and I am having trouble understanding how random numbers are generated in this setting. It seems like I can generate pseudorandom numbers on several cores with the same random seed:
using Distributed
addprocs(1)
@everywhere using SharedArrays, Random
A = SharedArray{Float64}(10,10)
rng = MersenneTwister(1)
@sync @distributed for i in 1:10
    A[:,i] = rand(rng,10)
end
The output is identical to the single-core version:
B = Array{Float64}(undef,10,10)
rng = MersenneTwister(1)
for i in 1:10
    B[:,i] = rand(rng,10)
end
A==B #true
Why does specifying a single random seed work, and why do I get the same result? I thought that if two parallel processes accessed and modified the state of the same random number generator, I would get an error, or at least a different result, since the state is accessed in a different order. Am I doing something wrong, so that my code is not really running in parallel?
Without the command-line option -p, I can reproduce your finding with addprocs(1):
-p, --procs {N|auto} Integer value N launches N additional local worker processes
"auto" launches as many workers as the number of local CPU threads (logical cores)
You can see which worker executes the loop by filling the array with the worker IDs:
using Distributed
addprocs(1)
@everywhere using SharedArrays, Random
A = SharedArray{Int}(10,10)
rng = MersenneTwister(1)
@sync @distributed for i in 1:10
    A[:,i] .= myid()
end
A
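A small experiment may also help here (a sketch, assuming a single added worker): the rng captured by a remote closure is serialized to the worker, so the worker draws from its own copy of the generator, and the master's generator is never advanced remotely.

```julia
using Distributed
addprocs(1)
@everywhere using Random

rng = MersenneTwister(1)
# `rng` is captured by the remote closure and serialized to the worker,
# so the worker receives its own copy with the same state.
remote_draw = fetch(@spawnat workers()[1] rand(rng, 5))
master_draw = rand(rng, 5)   # the master's copy was never advanced remotely
remote_draw == master_draw   # true: both copies started from the same state
```

With one worker, all iterations run on that single copy in order, which is why the distributed result matches the serial one.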
This I can’t answer, and it seems to be complex. See this old discussion:
And this one, not so old:
What you actually need now is different random numbers for each worker but reproducible… (still looking)
This looks good:
using Distributed
addprocs(4)
@everywhere using SharedArrays, Random
A = SharedArray{Float64}(10,10)
@everywhere Random.seed!(myid())
@sync @distributed for i in 1:10
    A[:,i] = rand(10)
end
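A quick check (a sketch, assuming two added workers; the count is arbitrary) that per-worker seeding really gives each worker its own stream:

```julia
using Distributed
addprocs(2)
@everywhere using Random

# seed every process with its own ID, as above
@everywhere Random.seed!(myid())

# draw from each worker's default RNG: different seeds, different streams
draws = [remotecall_fetch(rand, p, 3) for p in workers()]
allunique(draws)   # no two workers produced the same numbers
```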
Thanks a lot for the answer. I thought nprocs() would be the number of parallel workers, while nworkers() would exclude the main process. You are right that with more workers A != B, because each worker starts with the same seed, so I get several copies of the same random numbers.
Specifying the seed with the worker ID may be dangerous, because the result is not reproducible with a different number of workers. The following is safer, but less efficient, because it creates a new generator in every iteration:
A = SharedArray{Float64}(10,10)
@sync @distributed for i in 1:10
    rng = MersenneTwister(i)
    A[:,i] = rand(rng,10)
end
Absolutely true. A different number of workers does imply that it doesn’t reproduce; it is reproducible only with the same conditions and environment, of course. But even with the same number of workers it may not reproduce, because the worker IDs may differ.
This should do:
using Distributed
addprocs(4)
@everywhere using SharedArrays, Random
function remote_seed()
    ids = SharedArray{Int}(nworkers())
    # assumption: every worker executes exactly one iteration of this loop
    @sync @distributed for i in 1:nworkers()
        ids[i] = myid()
    end
    println(ids)
    sorted_ids = sort(ids)
    for i in 1:length(sorted_ids)
        # seed worker sorted_ids[i] with i, and wait until seeding has finished
        remotecall_wait(Random.seed!, sorted_ids[i], i)
    end
end
remote_seed()
A = SharedArray{Float64}(10,10)
@sync @distributed for i in 1:10
    A[:,i] = rand(10)
end
At least the seeding will be the same for each run. As the distribution of the individual tasks of a @distributed loop across the workers is not well defined, it still may not reproduce the random numbers; this could be achieved by calling rand via remotecall, as done with Random.seed! above.
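The remotecall idea for rand could look like this (a sketch; the mod1-based assignment of columns to workers is an assumption, not the only possible scheme):

```julia
using Distributed
addprocs(2)
@everywhere using Random

# seed each worker deterministically: lowest worker ID gets seed 1, and so on
w = sort(workers())
for (i, pid) in enumerate(w)
    remotecall_wait(Random.seed!, pid, i)
end

# pin column i to a fixed worker, so each worker's stream is consumed
# in a deterministic order, independent of the scheduler
A = Array{Float64}(undef, 10, 10)
for i in 1:10
    pid = w[mod1(i, length(w))]
    A[:, i] = remotecall_fetch(rand, pid, 10)
end
```

Re-running the seeding followed by the draw loop yields the same A, regardless of how @distributed would have scheduled the tasks.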
Here ends my interest in the topic of reproducibility of random numbers in a distributed environment.