Random number generation

Why do I get a different result when I run the rand command twice with the same seed in the Julia REPL?

using Random
using Distributions

rng = MersenneTwister(1234)

rand(rng) # outputs 0.5908446386657102

#run again
rand(rng) # outputs 0.7667970365022592

Any help appreciated. Thanks!

rand mutates the rng.

julia> rng = MersenneTwister(1234)
MersenneTwister(1234)

julia> rand(rng)
0.5908446386657102

julia> rng
MersenneTwister(1234, (0, 1002, 0, 1))

P.S. You probably want to use Xoshiro, which is typically better than MersenneTwister.
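
To make the mutation above concrete, here is a minimal sketch. It assumes `copy` is available for the generator type, which is the case for `MersenneTwister` in the `Random` standard library; the variable names are only illustrative.

```julia
using Random

rng = MersenneTwister(1234)
snapshot = copy(rng)   # independent copy of the generator's current state

a = rand(rng)          # advances rng, leaves snapshot alone
b = rand(rng)          # next number in rng's sequence
c = rand(snapshot)     # snapshot is still at the start, so it replays a

println(a == c)        # the copy reproduces the first draw
println(a == b)        # consecutive draws from the same rng differ
```

The second call returns a different number because the generator has moved one step along its sequence, not because the seed was ignored.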

Thanks @Oscar_Smith. That explains the behavior. So to get the same output, I guess the best way is to redefine the rng before every call to rand. Is that right? Maybe putting it inside a function is better. I'm just trying to learn how people go about seeding their code.
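
One common pattern for the "inside a function" idea is to pass the generator as an argument, so the caller decides the seed. The sketch below assumes Julia 1.7 or later (for `Xoshiro`); `noisy_mean` is a hypothetical helper invented for illustration, not something from a library.

```julia
using Random

# Hypothetical helper: all randomness comes from the rng that is passed in.
function noisy_mean(rng::AbstractRNG, n::Integer)
    return sum(rand(rng) for _ in 1:n) / n
end

# Two runs that each start from a freshly seeded generator agree exactly.
run1 = noisy_mean(Xoshiro(1234), 100)
run2 = noisy_mean(Xoshiro(1234), 100)

println(run1 == run2)
```

With this design the generator is seeded once per run rather than before every call to rand, and the function itself never touches global state.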

Also, I had never heard of Xoshiro. Thanks for pointing that out. Is there any particular reason why it’s better than MersenneTwister, apart from what the documentation mentions? It says: “Apart from the high speed, Xoshiro has a small memory footprint, making it suitable for applications where many different random states need to be held for long time.”

You can also seed! an existing rng, e.g. Random.seed!(rng, 1234).

Not sure I understand completely. I still get different outputs via this approach. Do you mind providing a simple example on how to use it? Thanks!

AFAIK the Xoshiro family has much better statistical properties than the Mersenne Twister. You can find some relevant papers with Google Scholar, but non-cryptographic pseudorandom generators are mostly evaluated with test suites like PractRand.

If you want every call to rand to give the same result, why are you calling rand multiple times? Why not just set a_random_x = rand() once and then use a_random_x as needed?

julia> using Random

julia> g = Xoshiro(123);

julia> rand(g)
0.521213795535383

julia> Random.seed!(g, 123)
Xoshiro(0xfefa8d41b8f5dca5, 0xf80cc98e147960c1, 0x20e2ccc17662fc1d, 0xea7a7dcb2e787c01, 0xf4e85a418b9c4f80)

julia> rand(g)
0.521213795535383

Think of a random number generator (RNG) as something that, when you seed it (or first create it), provides a fixed, deterministic sequence of random numbers. Every time you then call rand, you pop a single element off that sequence.

I wrote “deterministic” sequence because if you seed or construct the RNG with a given number, then the sequence that the RNG provides is always the same.
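
The "fixed sequence" picture can be checked directly. This is a small sketch, assuming Julia 1.7 or later for `Xoshiro`; the seeds and variable names are arbitrary choices for illustration.

```julia
using Random

gen_a = Xoshiro(2024)
gen_b = Xoshiro(2024)   # same seed as gen_a
gen_c = Xoshiro(2025)   # different seed

# Pop five elements off each generator's sequence, one call at a time.
draws_a = [rand(gen_a) for _ in 1:5]
draws_b = [rand(gen_b) for _ in 1:5]
draws_c = [rand(gen_c) for _ in 1:5]

println(draws_a == draws_b)   # same seed, same sequence
println(draws_a == draws_c)   # different seed, different sequence
```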

On a more serious note: it is a slight inconsistency that rand(rng) mutates the input but has no !… I can see why though.

8 Likes

It’s not inconsistent and it’s documented. The ! is used to warn the caller that a function modifies the contents of a data structure. It is not used for all modification of any state. You do not need a warning that readline(io) changes the state of the io argument—of course it does, that’s the only way for I/O operations to work. You also don’t need a warning that rand(rng) changes the state of the rng for much the same reason. By comparison there are many reasonable functions that take arrays and don’t mutate them. So when some function does mutate an array, you want a warning, and that’s what the ! is for.
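
For readers new to the convention, the array case it was designed for looks like this. The sketch uses only `sort` and `sort!` from Base.

```julia
xs = [3, 1, 2]

ys = sort(xs)                    # returns a new sorted array
unchanged = (xs == [3, 1, 2])    # xs still has its original order

sort!(xs)                        # the ! signals that xs itself is reordered

println(unchanged)
println(xs == ys)
```

Both versions exist because leaving the array alone is a reasonable expectation, so the mutating variant needs the warning.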

The current API isn’t bad, but I do kinda think a strict rule “mutate any argument, get a !” could be easier to learn and remember, e.g. read!(io) and readinto!(io, out), just because “always” is a simpler mnemonic than “unless it’s obvious”.

A random number generator has no official (documented) internal state, hence nothing is being mutated, officially. As far as the caller is concerned, the RNG might as well be measuring some cosmic radiation and simply return those bits.

That would make it impossible to express that some IO operations mutate their array argument.

Right, and similarly IO objects in principle just interact with the outside world. Of course there are buffers and other private state, but the primary state that changes is the outside world, which is entirely different from mutating a program-visible data structure. This is exactly the difference: ! is about warning that a function modifies a publicly visible data structure like an array or a dictionary. The state of I/O streams or RNGs is not that—it’s implicit internal state.

Great, this is the core of the argument. I favor a somewhat larger conception of “publicly visible” than that.

Specifically: my inclination is that a function without ! would modify internal state of its argument only if that state is not externally visible, such as a private cache that helps with the structure’s performance. Any function that has an effect on the argument visible from public APIs would have !. For example in

julia> let io = IOBuffer("hello")
           read(io), read(io)
       end
(UInt8[0x68, 0x65, 0x6c, 0x6c, 0x6f], UInt8[])

this read is (1) not :consistent because (2) it mutates its argument, so it would have !.

By contrast, if it didn’t take the io as an argument and just (2) mutated some hidden global state, or if it (1) reset the publicly readable state of its argument afterwards so that subsequent reads were consistent, then it would just be read().
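
The "reset the publicly readable state" variant can be sketched for seekable streams. `read_then_rewind` is a hypothetical name made up for this illustration, and the sketch assumes the stream supports `position` and `seek`, as `IOBuffer` does.

```julia
# Hypothetical helper: read everything, then restore the original position.
function read_then_rewind(io::IO)
    start = position(io)   # remember where the stream was
    bytes = read(io)       # consumes the rest of the stream
    seek(io, start)        # put the read position back
    return bytes
end

io = IOBuffer("hello")
first_read  = String(read_then_rewind(io))
second_read = String(read_then_rewind(io))

println(first_read == second_read)   # repeated calls now agree
```

A stream such as stdin generally cannot be rewound this way, so the sketch only covers the buffer case discussed here.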

The “official” interpretation is that read(io) doesn’t mutate io. It’s the state of the world that changed, not the io object.

I favor a somewhat larger conception of “publicly visible” than that.

That’s fine, but the Julia designers had to draw the line somewhere on what is explicit state and what is implicit state. There are always going to be examples where one could argue whether that line might have been better moved a bit in one direction or the other. In some sense, every function mutates, since it changes the content of your RAM or at least the registers in the CPU.

Functional languages like Haskell tend to draw the line for what is explicit state a lot closer and probably would support that read(io) should be considered mutating. But fundamentally, it’s a language design decision, and Julia is not at that stage of development anymore. So the semantics or name of read isn’t going to change.

Personally, I find the decisions that were made for Julia quite sensible, but it’s hard to argue if your intuition is different (maybe because you’re coming from a more functional language background?)

The internal state of memory is not public API, nor would be any internal cache or memory management structure. RAM and CPU state are of course not public either. The state of such things is not guaranteed by any public interface, so whether they are mutated is immaterial to the API design.

By contrast, the bytes returned by a buffer read are part of its public API: usage of read counts on it returning different bytes in successive reads. I don’t see what distinction is being implied between the state of a buffer and the state of an array, since both are mutated in such a way that they fail to return the same value by public APIs invoked twice.

julia> let xs = [1,2]
           pop!(xs), pop!(xs)
       end
(2, 1)

If an IOBuffer was the primary object that read was designed to act on, I’d be inclined to agree with you. But more commonly, io is a file handle or stream like stdin, and then the situation is much less clear-cut. And for other methods of read like read(filename) or read(command), thinking of read as mutating its argument wouldn’t make any sense at all.
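
As a quick illustration of the read(filename) point, here is a sketch that uses a temporary file created with `mktemp`; the file contents are arbitrary.

```julia
# Create a temporary file with known contents.
path, io = mktemp()
write(io, "hello")
close(io)

# Reading by filename opens the file afresh each time,
# so there is no read position carried over between calls.
first_read  = read(path, String)
second_read = read(path, String)

println(first_read == second_read)

rm(path)   # clean up the temporary file
```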

For names, no. I have found I disagree with basically every decision the core team has made about naming in general (I blame the Matlab influence). The rest of the language is great, but every other month I think about starting a project that could somehow shadow all the base names and replace them by sensible alternatives for people that agree with me to use.