ThreadedEx uses threads (Threads.@spawn) and DistributedEx uses multiple processes (Distributed.jl). Both let us use multiple CPUs, but only the latter lets us use multiple machines. If you are using only one machine, though, I'd try threading first: it's much easier to share things between tasks and it has less overhead.
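To make the "easier to share things" point concrete, here is a minimal sketch using only Base Julia. The function name threaded_sum and the chunking scheme are made up for illustration; this is not how ThreadedEx is implemented. Every task reads the same array in place, so nothing has to be copied or serialized as it would be when sending data to another process.

```julia
# Sum an array with one task per chunk. All tasks read the same `xs`
# directly from shared memory; only the small per-chunk results are collected.
function threaded_sum(xs, ntasks = Threads.nthreads())
    chunk_len = cld(length(xs), ntasks)
    chunks = Iterators.partition(eachindex(xs), chunk_len)
    tasks = [Threads.@spawn sum(@view xs[r]) for r in chunks]
    return sum(fetch, tasks)
end

data = collect(1:1_000_000)
threaded_sum(data) == sum(data)
```

Start Julia with several threads (for example `julia -t 4`) to see the tasks actually run in parallel; with a single thread the code still works, just sequentially.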
Unfortunately, choosing between threading and multi-processing in Julia is a bit complicated and case-specific. I try to explain a useful guideline here: Frequently asked questions
I'm not a PRNG specialist, but here's my understanding. Recall that all PRNGs are "just" clever state machines with a large but finite period (the minimum number of iterations it takes for the PRNG to come back to its initial state). What is specific to counter-based PRNGs is that you can set the state of the PRNG to an arbitrary point ("phase") of the period with very little computation. That's what the set_counter! function I'm calling does. So, since we know the period of the PRNG and there is a way to choose where to start, we can evenly split the entire period into chunks and use each chunk locally within a thread.
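A toy sketch may help illustrate the idea. Everything below (ToyCounterRNG, toy_next!, toy_set_counter!, chunk_starts) is hypothetical and written only for this explanation; it is not the algorithm or API of any real package. The point is just that when the output depends only on a counter, jumping to any phase is a single assignment.

```julia
# Toy counter-based generator: the output depends only on the counter value.
# (Hypothetical illustration; a real counter-based PRNG uses a stronger mixer.)
mutable struct ToyCounterRNG
    counter::UInt64
end

# SplitMix64-style bit mixer that scrambles a counter into a pseudo-random word.
function mix(z::UInt64)
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

# Advance the counter by one and return the word for the new counter value.
function toy_next!(rng::ToyCounterRNG)
    rng.counter += 1
    return mix(rng.counter)
end

# Jump to an arbitrary phase of the period in constant time.
toy_set_counter!(rng::ToyCounterRNG, c::Integer) = (rng.counter = c; rng)

# Split the counter space into `nchunks` equal chunks and return
# the starting counter of each chunk.
function chunk_starts(nchunks)
    stride = typemax(UInt64) ÷ UInt64(nchunks)
    return [stride * UInt64(i - 1) for i in 1:nchunks]
end
```

Stepping a generator five times and jumping another one straight to counter 5 leave both in the same state, so `toy_next!` returns the same word for each from then on. Each thread can therefore be handed one entry of `chunk_starts(n)` and draw from its own chunk without talking to the others.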
I'm introducing the inner loop because set_counter! has an overhead comparable to the actual random number generation. So, it makes sense to use the RNG for a while after you set the counter.
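The outer/inner loop structure can be sketched roughly as follows. This is a stand-alone toy, assuming a made-up generator (ToyRNG, toy_rand, toy_set_counter!) in place of a real counter-based PRNG and plain Threads.@spawn in place of an executor, so treat it as a picture of the pattern rather than the actual code. The counter is set once per chunk, and the inner loop then draws m pairs of numbers before the next jump.

```julia
# Hypothetical stand-in for a real counter-based PRNG.
mutable struct ToyRNG
    counter::UInt64
end

# SplitMix64-style mixer turning a counter into a pseudo-random 64-bit word.
function toy_mix(z::UInt64)
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

toy_set_counter!(rng::ToyRNG, c::UInt64) = (rng.counter = c; rng)

# Uniform Float64 in [0, 1) built from the top 53 bits of the mixed counter.
function toy_rand(rng::ToyRNG)
    rng.counter += 1
    return ldexp(Float64(toy_mix(rng.counter) >> 11), -53)
end

function toy_monte_carlo_pi(nchunks, m)
    stride = typemax(UInt64) ÷ UInt64(nchunks)
    tasks = map(1:nchunks) do i
        Threads.@spawn begin
            # Outer loop body: one jump per chunk.
            rng = toy_set_counter!(ToyRNG(0), stride * UInt64(i - 1))
            hits = 0
            # Inner loop: reuse the RNG many times to amortize the jump.
            for _ in 1:m
                x = toy_rand(rng)
                y = toy_rand(rng)
                hits += (x^2 + y^2 < 1)
            end
            hits
        end
    end
    return 4 * sum(fetch, tasks) / (nchunks * m)
end

toy_monte_carlo_pi(8, 100_000)
```

Because each chunk's starting counter depends only on its index, the result is the same no matter how many threads are available or how the tasks are scheduled.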