Multiple independent random number streams

Ross_Boylan · April 27, 2023, 5:15pm

Goal

Generate random data across many (e.g., 10,000) simulations. Each simulation should be independent of the others. It must be possible to recreate a particular simulation s without first running the previous s-1 simulations.

Ordinarily this tends to get discussed as parallel random number generation, but my application creates all data sequentially in the main thread.

My goal in this post is to get advice about how to do it.

I can think of several alternatives, none of which seem entirely satisfactory.

Alternatives

Simple-minded seeding

For each simulation s do seed!(myseed+s).
This is easy and, I think, fairly conventional. But reportedly it’s not entirely reliable if the goal is real independence of the resulting streams.
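A minimal sketch of this seeding scheme, assuming Julia 1.7 or later (where Xoshiro is available). The names MYSEED, simulation_rng, and simulate are hypothetical stand-ins, not part of any library; simulate just draws a few normals in place of a real simulation.

```julia
using Random

const MYSEED = 20230427  # hypothetical base seed

# Build a fresh generator for simulation s; no earlier simulation has to run first.
simulation_rng(s::Integer) = Xoshiro(MYSEED + s)

# Hypothetical stand-in for one simulation: draw n standard normals.
simulate(s::Integer; n::Integer = 5) = randn(simulation_rng(s), n)

a = simulate(4321)
b = simulate(4321)   # recreated directly, without running simulations 1 to 4320
c = simulate(4322)   # a different simulation gets a different stream
```

Passing an explicit generator to each simulation, rather than reseeding the global one, keeps a simulation's draws from depending on whatever else has touched the default RNG.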

abc123

This “counter”-based approach was recommended in several places as a good fit for this kind of problem. But its documentation is spotty: some links are dead, and some things, like the actual arguments to the generators, are not documented at all or are only mentioned in examples. And I don’t think it’s a drop-in replacement for other Julia RNGs, since it only produces blocks of bits (at least, that is all that’s documented).

hybrid

Use abc123 to set the seed for the main RNG. I have no idea if this would behave any better than the simple-minded seeding.

jump

I’ve seen some examples that use jump to move the random number stream to a new spot. This is not in the documented interface that I see, but if I jump by something much more than the number of random numbers each simulation draws that might work. Or maybe Future.randjump.
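A rough sketch of the jump idea using Future.randjump, which is defined for MersenneTwister only. BASE_SEED, STRIDE, and jumped_rng are hypothetical names, and the stride of 10^20 steps is an assumption chosen to be far larger than anything one simulation would draw (per the Future.randjump documentation, one step corresponds to generating two Float64 numbers).

```julia
using Random, Future

const BASE_SEED = 1234        # hypothetical base seed
const STRIDE = big(10)^20     # steps reserved per simulation (assumed ample)

# Generator for simulation s (s >= 1), positioned s * STRIDE steps past the base state.
function jumped_rng(s::Integer)
    s >= 1 || throw(ArgumentError("s must be at least 1"))
    return Future.randjump(MersenneTwister(BASE_SEED), STRIDE * s)
end

x = rand(jumped_rng(3), 4)
y = rand(jumped_rng(3), 4)   # same simulation, recreated without running 1 and 2
z = rand(jumped_rng(4), 4)   # next simulation, a disjoint stretch of the stream
```

Each distinct step count requires a jump polynomial to be computed internally, so creating a generator this way costs more than plain seeding; whether that matters depends on how long each simulation runs.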

Prior Discussion

stevengj · April 27, 2023, 5:29pm

Maybe do the same thing but use a cryptographic hash to ensure that consecutive values of s lead to very different seeds? e.g.

using SHA, Random

# like Random.seed! but use an SHA256 cryptographic hash of seed
cryptoseed!(rng::AbstractRNG, seed::Base.BitInteger) = 
    Random.seed!(rng, Vector(reinterpret(UInt32, sha256(reinterpret(UInt8, [seed])))))
cryptoseed!(seed::Base.BitInteger) = cryptoseed!(Random.default_rng(), seed)

Update: It looks like this is redundant. The default Xoshiro PRNG implementation in Random already does an SHA hash of the seed.

tbeason · April 27, 2023, 6:27pm

I don’t see why you’d do anything other than the simple/naive thing. In fact, if not for the requirement that a particular simulation be recreatable without running the previous ones, I would tell you just to set the seed once at the beginning of the loop.

State of the art RNGs, like the one Julia uses, are good. Unless you are specifically working on some sort of cyber security project (sounds like you are not), I would take these random number streams as pure randomness.

stephenll · April 27, 2023, 7:08pm

At my company, we would like reproducibility of the random streams in heterogeneous computing environments for a given seed and the same inputs. Meaning, get the same results on a computer with 2 cores, as you get with 40+ cores. We strive for this, although there are other issues at hand.

Our approach, which has worked so far: (1) each simulation is really fast, so we group the simulations into buckets of N, where N is tuned until it is close to optimal and then left alone in production; (2) we loop over the buckets and spawn a task for each one. The random seed is set once at the very beginning of the program, before looping over the buckets.

The number of buckets is the same no matter how many cores the computer has, so the same number of tasks is spawned.
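The bucket scheme might look roughly like the sketch below. It assumes Julia 1.7 or later, where each spawned task derives its own RNG state from its parent task at spawn time, so the draws depend on the order in which tasks are created rather than on the number of threads. The names simulate and run_buckets, and the bucket size of 100, are hypothetical.

```julia
using Random

# Hypothetical stand-in for one fast simulation; uses the task-local default RNG.
simulate() = sum(rand(3))

function run_buckets(seed::Integer; nsims::Integer = 1_000, bucket_size::Integer = 100)
    Random.seed!(seed)  # set once, before any task is spawned
    buckets = collect(Iterators.partition(1:nsims, bucket_size))
    # One task per bucket; the bucket count is fixed by nsims and bucket_size,
    # not by the number of available cores.
    tasks = map(buckets) do bucket
        Threads.@spawn [simulate() for _ in bucket]
    end
    return reduce(vcat, fetch.(tasks))
end

r1 = run_buckets(42)
r2 = run_buckets(42)   # same seed and same bucket layout, so the same results
```

Note that changing bucket_size changes which draws each simulation receives, which is one reason to fix it once and leave it alone.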

lmiq · April 27, 2023, 7:33pm

Since 1.7, when Xoshiro became the default. Previously there were issues with threaded rands, which could be tackled with those “future jump” tricks.

(I just recently had this same question, and Sukera and M. Potter clarified this for me. I ended up using RandomNumbers.jl to get a modern generator while keeping compatibility with 1.6 in my package.)
