This is a companion post to https://github.com/JuliaLang/julia/pull/34852 , for a more user-centric discussion.
Reproducibility of randomized computations (when Random.seed! is used) is a very nice property: it helps with testing, debugging, and generally with reproducing results. Unfortunately, reproducibility is currently somewhat at odds with performance: if you spawn any @async tasks or use multithreading, then you normally don't get reproducibility, even if there are no race conditions in your user code. This is because the speed at which tasks finish, and how the scheduler switches between tasks, is fundamentally unpredictable: it depends on randomized algorithms in the OS kernel and on the microarchitectural state of your CPU cores. Schedulers are inherently racy.
Scheduler decisions affect the random numbers you get because of shared global state: the random generator, which is mutated by all your tasks (julia tasks, also called coroutines). For performance reasons, the julia default RNG is not actually a single global generator: instead, there are nthreads() many RNGs, and each task mutates the RNG that belongs to the thread on which the task happened to be scheduled. This, however, does not improve the reproducibility situation.
One obvious solution is to hand each computational Task its own RNG instance (discarded on task termination), so that there is no shared state; this RNG instance is seeded from the parent RNG, which is advanced in the process. With this, one can write computations whose results are independent of scheduler decisions.
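To make the by-hand version concrete, here is a minimal sketch of that idea. The names child_rng and simulate are hypothetical helpers made up for this illustration, and MersenneTwister is used only because it is available everywhere; this is not the implementation from the PR.

```julia
using Random

# Hypothetical helper: derive a fresh child RNG from the parent RNG.
# Drawing the seed advances the parent's stream.
child_rng(parent::AbstractRNG) = MersenneTwister(rand(parent, UInt64))

# Hypothetical driver: run `ntasks` tasks, each with its own RNG.
function simulate(seed::Integer, ntasks::Integer)
    parent = MersenneTwister(seed)
    # All child RNGs are derived sequentially in the parent task, before
    # anything is spawned, so the derivation order is fixed.
    rngs = [child_rng(parent) for _ in 1:ntasks]
    tasks = map(rngs) do rng
        # Each task only touches its own RNG: no shared mutable state.
        Threads.@spawn sum(rand(rng, 1000))
    end
    return fetch.(tasks)
end

# Same seed gives the same results, regardless of how tasks get scheduled.
simulate(42, 8) == simulate(42, 8)
```

The important detail is that the child seeds are drawn in a deterministic order by the parent; what the scheduler does afterwards no longer matters for the results.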
Doing this by hand is somewhat cumbersome: user code would be responsible for all the bookkeeping and for passing RNG instances around. An alternative is to make the julia runtime do this work. This incurs a small (negligible?) overhead on task creation and necessitates a change in the RNG algorithm (Mersenne Twister takes too much memory). In the proof-of-concept, one uses rand(Random.TaskLocal()) to access the multithreading-reproducible random stream.
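For a rough feeling of what the runtime would be doing on your behalf, the bookkeeping can be emulated in user code with task-local storage. This is only a sketch under my own assumptions: RNG_KEY, task_rng, spawn_with_rng and demo are hypothetical names, it does not use Random.TaskLocal from the proof-of-concept, and it keeps MersenneTwister for simplicity even though its memory footprint is exactly what a runtime-level solution would want to avoid.

```julia
using Random

# Hypothetical key under which each task stores its private RNG.
const RNG_KEY = :reproducible_rng

# Fetch the current task's RNG. The fixed fallback seed 0 is an arbitrary
# choice for tasks that never had an RNG installed.
task_rng() = get!(() -> MersenneTwister(0), task_local_storage(), RNG_KEY)::MersenneTwister

# Hypothetical replacement for Threads.@spawn: seed the child from the
# parent's RNG (advancing the parent), then run `f` in the new task.
function spawn_with_rng(f)
    child_seed = rand(task_rng(), UInt64)
    return Threads.@spawn begin
        task_local_storage(RNG_KEY, MersenneTwister(child_seed))
        f()
    end
end

# Hypothetical demo: seed the current task, then spawn four children.
function demo(seed::Integer)
    task_local_storage(RNG_KEY, MersenneTwister(seed))
    tasks = [spawn_with_rng(() -> rand(task_rng())) for _ in 1:4]
    return fetch.(tasks)
end

demo(1) == demo(1)  # true: results depend only on the seed
```

With runtime support, user code would not need the spawn wrapper at all, since every task creation would do the seeding step automatically.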
So, my question for the people here on discourse who run or maintain infrastructure for randomized computations: How do you currently deal with reproducibility vis-a-vis multithreading? Is this something only a tiny minority cares about? Is this a killer feature that you didn’t think was possible?