I’m working on an optimization problem on graphs using some code I ported from R to Julia.
What’s strange is that when I run the code, which uses stochastic processes to sample a solution space in search of a Pareto front of optimal solutions, it doesn’t always generate the same output, despite setting the seed with Random.seed! first.
The only direct dependencies I’m using are as follows: Reexport, Graphs, Distributed, StatsBase, LinearAlgebra, Random, Distributions, Pipe.
Do any of these not use the standard RNG to do their thing?
Also, this error persists even when I only use a single run with pmap on the @everywhere code, so I don’t think it has to do with the fact I’m trying to multithread it.
But if you do not parallelize it at all, not even with pmap/@everywhere, does the problem still arise? I am focusing on that because it is very improbable that Julia’s Random is non-deterministic unless this has something to do with parallelism.
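To illustrate why I suspect the parallel part: as far as I know, calling Random.seed! on the main process only seeds the default generator there, and each worker process keeps its own generator state. A minimal sketch of one way around that is to pass a seed along with each job and build the generator inside the function. The function name generate_graph_sample and the seed range are made up for illustration and are not from your code.

```julia
using Distributed
@everywhere using Random

# Hypothetical placeholder for the per-graph work (an assumption, not the real code).
# The seed travels with the job, so the result does not depend on which
# worker picks the job up or on what that worker has already sampled.
@everywhere function generate_graph_sample(seed::Integer)
    rng = Xoshiro(seed)          # explicit generator, local to this call
    return rand(rng, 1:100, 3)   # three pseudorandom integers in 1:100
end

seeds = 1:8
results_a = pmap(generate_graph_sample, seeds)
results_b = pmap(generate_graph_sample, seeds)

# Both runs should agree, with or without extra workers added via addprocs.
println(results_a == results_b)
```

If this pattern gives identical output across runs while the original code does not, the difference is most likely in how the workers are seeded rather than in Random itself.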
I can try it; however, the runtime of the algorithm tanks without parallelism since I have to generate thousands of graphs and consolidate them, and this normally runs on the supercomputer.
However, if I run pmap with collections of length 1, there shouldn’t be any way for it to vary, right? It only distributes the function calls over the collections across the threads, right? That’s what I’m currently doing, and it’s causing the issue, but I’ll try to remove it and I’ll report back as soon as I can.
The default pseudorandom generator is Task-local. Is it possible that some sub-graphs of interest share Tasks? Perhaps your code is racy?
A possible path to a solution would be to allocate Xoshiro instances explicitly, instead of relying on the default generator. This shouldn’t be necessary, assuming there are no relevant bugs in either Julia or your code, but it could perhaps lead to recognizing a bug in your code, if any. Could be more performant, too.
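A rough sketch of what I mean by explicit Xoshiro instances, assuming each unit of work can be identified by an integer job index. The names sample_solution and run_job, and the base seed 1234, are placeholders I made up, not anything from your code.

```julia
using Random

# Hypothetical stand-in for one stochastic sampling step of the search.
# It takes the generator as an argument instead of using the default one.
sample_solution(rng::AbstractRNG, n::Integer) = rand(rng, n)

# Each job builds its own generator from a base seed plus the job index,
# so the output is independent of execution order and of which Task runs it.
function run_job(base_seed::Integer, job::Integer; n::Integer = 5)
    rng = Xoshiro(base_seed + job)
    return sample_solution(rng, n)
end

# Running the jobs in a different order gives the same per-job results.
first_run  = [run_job(1234, j) for j in 1:4]
second_run = reverse([run_job(1234, j) for j in reverse(1:4)])

println(first_run == second_run)
```

Threading the rng argument through every function that samples also makes it easy to spot a call that still silently falls back to the default generator.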
Ok, so now it looks like I’m getting deterministic behavior…could this be caused by having multiple calls to “using Random”? In my original code, I reexport Random and accidentally called it a second time, but now I’m getting consistent (buggy) behavior as I try to replicate this. Huh.
Edit: To be clear, the change is that I didn’t call using Random a second time, and now the behavior is at least consistent, even though it errors out due to a bug.