Random numbers, with MersenneTwister, not stable across major releases

Hi, after updating to Julia 1.11, I have an issue with a random number sequence that changed.

using Random
rng = Random.MersenneTwister(42)
v = rng.rand(10)
println(v)

This gives different results under 1.10.2 and 1.11.1.
I couldn’t find this in the release notes. Isn’t this supposed to be stable across releases?

No, it is not.

If you need (possibly for testing purposes) a stable random number generator, there is StableRNGs.jl


No. See: Random Numbers: Reproducibility in the manual. (NumPy adopted a similar policy.)

Also see previous discussions, such as "A stable RNG for tests: request for feed-back" and "How to generate reproducible random numbers across versions via package manifest?" (#19 by Sukera).


I see. Thanks for clarifying :slight_smile:

This doesn’t work (the rng.rand(10) line); it should be rand(rng, 10). And why are you using MersenneTwister? It’s outdated/slower.
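For reference, here is a version of the snippet that runs, assuming the intent was ten uniform Float64 samples. Julia has no object-method syntax, so the RNG is passed as the first argument to rand:

```julia
using Random

# The RNG object goes in as the first argument; rng.rand(10) is not valid Julia.
rng = MersenneTwister(42)
v = rand(rng, 10)   # Vector{Float64} with 10 samples in [0, 1)
println(v)
```

The printed values are deliberately not shown here, since they are exactly what differs between releases.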

The new default rng is:

julia> using Random

julia> rng = Xoshiro(42);

julia> v = rand(rng)
0.6293451231426089

and it IS (still) reproducible: it has given the same stream at least in 1.10 (LTS) and 1.11. Xoshiro is not available in 1.6 (the previous LTS), but there’s really no good reason to use that version or to try to support it with new code.
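What you can rely on is reproducibility within one Julia version: the same seed gives the same stream. A minimal sketch (the variable names are just for illustration):

```julia
using Random

# Two generators built from the same seed produce the same stream
# (within a single Julia version).
a = rand(Xoshiro(42), 5)
b = rand(Xoshiro(42), 5)
same_seed_same_stream = (a == b)

# copy() snapshots the generator state, so a run can be replayed later.
rng = Xoshiro(42)
snapshot = copy(rng)
first_draw = rand(rng, 3)
replayed = rand(snapshot, 3)

println(same_seed_same_stream, " ", first_draw == replayed)
```

Whether those values match another Julia release is a separate question, and the manual does not promise it.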

Note:

help?> rand
  rand([rng=default_rng()], [S], [dims...])

Although the signature shows default_rng() without a prefix, you need the Random module prefix to call it:

julia> Random.default_rng()  # This actually means, i.e. is implemented by, Xoshiro, though that could change:
TaskLocalRNG()
help?> TaskLocalRNG
search: TaskLocalRNG

  TaskLocalRNG

  The TaskLocalRNG has state that is local to its task, not its thread. It is seeded upon task creation, from the state of its parent task [..]

  Using or seeding the RNG of any other task than the one returned by current_task() is undefined behavior: it will work most of the time, and may sometimes fail silently.
[..]

  │ Julia 1.10
  │
  │  Task creation no longer advances the parent task's RNG state as of Julia 1.10.
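If you stay with the default generator, you can still make a run repeatable by seeding the current task's RNG with Random.seed!. A small sketch, assuming Julia 1.7 or later (where the default RNG is a TaskLocalRNG):

```julia
using Random

# Seed the task-local default RNG, draw, then reseed and draw again.
Random.seed!(1234)
x = rand(3)

Random.seed!(1234)
y = rand(3)

# On Julia 1.7+ the default RNG is a TaskLocalRNG (assumption stated above).
is_task_local = Random.default_rng() isa TaskLocalRNG

println(x == y, " ", is_task_local)
```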

You CAN instantiate Xoshiro yourself. If you don’t seed it, it will be seeded randomly, and asking for reproducible random numbers then seems an oxymoron to me… :slight_smile: But if you do it yourself with a known seed, it will give a defined sequence of numbers (integers), of the kind you get from:

julia> rand(UInt64)
0x1db0c85439a70926

I do not expect it [clarification: Xoshiro, not rand for non-integers] to ever change. The only reason I see for changing it is if it is found to be flawed (it is not likely that a faster one will ever be found). Though if you generate floating-point numbers, then inherently you first generate an integer, and then a different algorithm produces the float from it. I don’t know whether that step is optimal, so it could be changed.
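To illustrate the two-step idea (integer first, float second), here is one common conversion: keep the top 53 bits and scale by 2^-53. This is only a sketch of the general technique; bits_to_float is a hypothetical helper and not necessarily what Julia's rand does internally.

```julia
using Random

# Hypothetical helper: map 64 random bits to a Float64 in [0, 1).
# Keeps the top 53 bits (the Float64 significand width) and scales by 2^-53.
bits_to_float(u::UInt64) = ldexp(Float64(u >> 11), -53)

rng = Xoshiro(42)
u = rand(rng, UInt64)   # step 1: raw integer from the generator
x = bits_to_float(u)    # step 2: separate integer-to-float conversion

println(x)
```

Changing step 2 would change the floats even if the integer stream from step 1 stayed bit-for-bit identical.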

IF Xoshiro is retired, then I suppose a new generator will be adopted, again under a new name, since it does not seem useful to reuse the old one. TaskLocalRNG will likely be kept, pointing to the new generator, and both the new and the old will keep giving you random numbers.

I’m kind of curious WHY MersenneTwister started giving new integers and floating-point numbers, while Xoshiro kept both reproducible.

Not only can the algorithm itself change, or some small detail such as the seeding algorithm, but algorithms built on top of it, like randn(), can change too.
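One consequence for tests: checking statistical properties survives such changes, while hard-coded values do not. A rough sketch, with tolerances picked arbitrarily for illustration:

```julia
using Random, Statistics

# Check properties of the samples rather than exact values, so the test
# does not depend on the particular stream or on how randn is implemented.
rng = Xoshiro(2024)
x = randn(rng, 10_000)

mean_ok = abs(mean(x)) < 0.1      # standard normal: mean close to 0
std_ok  = abs(std(x) - 1) < 0.1   # standard normal: std close to 1

println(mean_ok, " ", std_ok)
```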


What do you mean? Yes, I mentioned some exceptions, but if you provide a seed yourself, isn’t that seed used without going through some algorithm? I’m not up to speed on SplitMix, if that is what you mean, but I think it only applies to parallel/threaded use, i.e. TaskLocalRNG, and not to using Xoshiro directly (even with threads; there, race conditions will just make it non-reproducible in practice). That was my context.

We hash the seed (so that even if the user seeds with a tiny number like 0, the RNG doesn’t get a state that is almost all zeros, which can be problematic).

I can confirm that happens for MersenneTwister, yes, but it doesn’t seem to be done for Xoshiro (which would also explain why the latter is still reproducible; I infer the hashing changed in 1.11, and neither is promised to always stay the same).

I think you mean that an almost-zero or exactly-zero state is problematic, but that could happen even after seeding, though it is exceedingly unlikely…

julia> rng = Xoshiro(0) # no warning against doing this, note not the same as Xoshiro():

If no seed is provided, a randomly generated one is created (using entropy from the system). See the seed! function for reseeding an already existing Xoshiro object.

│ Julia 1.11

│ Passing a negative integer seed requires at least Julia 1.11.

I think, but am not sure, that any integer including 0 is OK for it. Still, see:

SplitMix64: Recommended for initializing generators of the xoshiro family from a 64-bit seed. Used for implementing seed_from_u64.

I note it IS used, by now, with Xoshiro for threads, and I think the warning may only apply to that, generating for many threads.

If you initialize your own, at least for just one thread, directly (most wouldn’t… or then likely not with 0), then I think it’s OK. If not, and SplitMix and/or hashing needs to be applied, then of course that’s an argument against keeping Xoshiro reproducible, i.e. it would have to change because of a minor bug. But if this is actually OK, then I would prefer it just be left alone as is.
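A quick way to check the seed-0 concern using only the public API: an all-zero xoshiro state would output zeros forever, so any nonzero draw shows the state is not degenerate. This sketch does not tell you how the seed was mapped to the state, only that the result is usable:

```julia
using Random

# Draw a few raw integers from a generator seeded with 0.
draws = rand(Xoshiro(0), UInt64, 8)

# An all-zero xoshiro state can only ever emit zeros.
nondegenerate = any(!iszero, draws)

# Unlike Xoshiro(), an explicit seed repeats within the same Julia version.
repeatable = (draws == rand(Xoshiro(0), UInt64, 8))

println(nondegenerate, " ", repeatable)
```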

FYI:
https://www.reddit.com/r/cpp/comments/169xg7z/a_very_general_xoshiroxoroshiro_random_number/

  1. Efficient jump ahead methods are provided that work for arbitrary jump sizes.