Reproducibility of multiple calls to rand

Hello, new Julia (version 1.4.1) user here, trying to understand the reproducibility of random number sequences. Below is an example consisting of two tests:

using Random

blksize = 1000

#=
Test 1: Generate a sequence of 3000 random numbers in three blocks of
1000 and output the first number in every block, which should correspond
to elements 1, 1001, 2001 in the sequence.
=#
println("Test 1: Generate random numbers in blocks")

rng = MersenneTwister(1234)
blk1 = rand(rng, blksize)
println("blk1: ", blk1[1])
blk2 = rand(rng, blksize)
println("blk2: ", blk2[1])
blk3 = rand(rng, blksize)
println("blk3: ", blk3[1])

println("")

#=
Test 2: Generate all 3000 random numbers in a single call, and output elements 1,
1001, 2001 in the sequence.
=#
println("Test 2: Generate random numbers in a single big block")

rng = MersenneTwister(1234)
bigblk = rand(rng, 3*blksize)
println("blk1: ", bigblk[1])
println("blk2: ", bigblk[blksize + 1])
println("blk3: ", bigblk[2*blksize + 1])

Both sequences are generated from the same initial seed so I would expect the same results. But that’s not what I see:

Test 1: Generate random numbers in blocks
blk1: 0.5908446386657102
blk2: 0.0987939295091631
blk3: 0.2690250724723906

Test 2: Generate random numbers in a single big block
blk1: 0.5908446386657102
blk2: 0.9540122269417939
blk3: 0.0987939295091631

I am especially puzzled why the first element of blk2 in the first test would appear as the first element of blk3 in the second test. Also, I only see this behavior for block sizes >400 or so, for smaller blocks I get the same results for both tests.

Any advice would be very much appreciated. Thanks!

2 Likes

Is it

julia> using Random

julia> Random.dsfmt_get_min_array_size()
382

?

Basically, the random number generator can only produce large numbers of random numbers at a time.
So, for smaller numbers of random numbers, it fills a buffer and then can index into that one at a time.
For larger numbers, it can fill arrays directly.

3 Likes

The following line seems to be the reason why rand(rng,n) with n <= 383 is handled differently from n >= 384:

I am not sure whether Julia is supposed to guarantee that [rand(n1); rand(n2)] is the same as rand(n1+n2). If so, your findings would indicate a bug.

1 Like

Yes, precisely.

I don’t think there is any such guarantee.

5 Likes

“By using an RNG parameter initialized with a given seed, you can reproduce the same pseudorandom number sequence when running your program multiple times.” - doesn’t talk about how you read the sequence.

I see…that is unfortunate, it would make testing a little easier if a given seed could yield a reproducible sequence regardless of how rand is called. Thanks everyone for your help!

You may be interested in StableRNGs.jl — it’s a simple RNG geared for usage in tests (as the builtin RNG may change/improve in future versions of Julia), but it’s also a really simple and straightforward RNG that I think will happen to have the perfectly-sequential property you’re after.

5 Likes

Yeah, exactly. So it is then not guaranteed.

1 Like

Yes it’s indeed “guaranteed” :slight_smile:

This is exactly what I need…thanks!