Randomly generate a sequence of numbers one by one using the rand function in the Distributions module

Hi, the rand function in the Distributions module is very useful for generating a sequence / vector / list of random numbers from a given random number generator and a statistical distribution. What I need is to generate a long sequence of random numbers (say, several million) and process them one by one. I can simply call rand once to generate the whole sequence, save it in a vector, and then loop over it, but this does not seem efficient. Is there any other way to handle this? Many thanks.

You could use a generator:

n = 1_000_000
rg = (rand() for _ in 1:n)
for r in rg
    #do stuff with random number `r`
end

Or just a loop?

for i = 1:n 
    r = rand()
    do_something(r)
end

There's nothing wrong with loops in Julia, unlike in some other languages where you have to avoid them in performance-critical code.

Thank you for the kind reply. The situation I'm facing is that the rand function in the Distributions module can optionally take a random number generator as its first argument. If I call rand with a generator supplied this way, say MersenneTwister(4), then it gives me the "same" number each time, which is not what I need. Moreover, I need more than one such rand call inside the same loop. Do you know how to accomplish this? For now I generate all the random numbers I need up front, save them in a vector, and then process them one by one inside the for loop. I wonder if there is a better way.

I'm not sure I understand, but that will be true only if you put the MersenneTwister(4) call within the loop. Just assign the rng to a variable before the loop:

using Random

rng = MersenneTwister(4) # or rng = MersenneTwister() for a random seed
for i = 1:n 
    r = rand(rng) # different value at each iteration
    do_something(r)
end

Thank you for your reply. The thing is, I need more than one rand call inside the for loop and, to make it more challenging, each rand call needs to use a different random number generator…

I don't really understand where your problem is… What about creating a second rng outside of the loop and making the rand calls as needed within the loop?

julia> using Random

julia> rng = MersenneTwister(4)

julia> using Distributions

julia> for i = 1:10
           r = rand(rng,Categorical([0.4,0.6]))
           println(r)
       end
2
2
2
2
2
2
1
2
2
2

It's not what is expected.

What do you expect?

Oh, you're right. I thought the output above was all 2's and did not notice that there is a 1 hidden in it.

I'd guess that you might be better off creating that Categorical distribution object outside of the loop.

why?

Because it's the same every time, and it's pretty hard for the compiler to prove that it doesn't need to create a new array, and an object referring to that array, on every iteration.
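
A minimal sketch of that suggestion, assuming the same Categorical([0.4, 0.6]) distribution as in the example above. The helper name count_draws and the tallying of outcomes are hypothetical, added only to give the loop something to do; the point is that the distribution object is built once and reused on every iteration.

```julia
using Random
using Distributions

# Build the generator and the distribution once, outside the loop.
rng = MersenneTwister(4)
d = Categorical([0.4, 0.6])

# Hypothetical helper: draw n samples one at a time and tally each outcome.
function count_draws(rng, d, n)
    counts = zeros(Int, ncategories(d))
    for _ in 1:n
        k = rand(rng, d)   # reuses `d`; no new array or distribution is allocated here
        counts[k] += 1
    end
    return counts
end

counts = count_draws(rng, d, 1_000_000)
println(counts ./ sum(counts))   # empirical frequencies, should be near 0.4 and 0.6
```

Passing rng and d as function arguments also keeps them out of global scope, which usually helps performance in Julia.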

julia> using Random
julia> using Distributions
julia> rng1 = MersenneTwister(1)
julia> rng2 = MersenneTwister(2)
julia> for i = 1:10
           r = rand(rng1, Categorical([0.4,0.6]))
           println("r: ", r)
           t = rand(rng2, Exponential(1/200))
           println("t: ", t)
       end
r: 1
t: 0.0030177027481547076
r: 1
t: 0.009348155469112253
r: 1
t: 0.004486075863176537
r: 1
t: 0.007756887782850566
r: 2
t: 0.007259221922800534
r: 1
t: 0.0038109894089538372
r: 2
t: 0.0015894111269976982
r: 2
t: 0.004725440286443651
r: 1
t: 0.006614692669167275
r: 2
t: 0.0026035963682239554

julia> for i = 1:10
           r = rand(MersenneTwister(1), Categorical([0.4,0.6]))
           t = rand(MersenneTwister(4), Exponential(1/200))
           println("r: ", r)
           println("t: ", t)
       end
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384

The comparison above clears up my confusion. I know how it works now. I guess calling the MersenneTwister() function is like resetting the global random number generator. Or can anyone explain it more clearly?

Each call to MersenneTwister(4) "is like" re-seeding the global RNG: it constructs a fresh generator in the same initial state, so doing that in the loop will invariably produce the same output. Moreover, it's very wasteful to create so many new RNGs. By the way, do you really need to create two RNGs for the two distributions? In your example above, at least, it's unnecessary.
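
The difference can be checked directly with a small sketch that uses only the Random standard library. The variable names are illustrative.

```julia
using Random

# Two generators constructed from the same seed start in the same state,
# so their first draws are identical.
a = rand(MersenneTwister(4))
b = rand(MersenneTwister(4))
println(a == b)   # true

# Reusing one generator advances its state, so successive draws differ.
rng = MersenneTwister(4)
x = rand(rng)
y = rand(rng)
println(x == a)   # true: first draw from a generator seeded with 4
println(x == y)   # false: the state moved on between the two calls
```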

To make sure they are strictly not related at all…

Are you sure that makes them more independent? I'm no expert on this, but it sounds to me like the type of 'clever' overcomplication that could end up making them less independent. You should definitely double-check your assumption.

Also, why are you redefining the distributions (to the same values!) on each loop iteration?

As a rule of thumb, create independent RNGs for multithreaded code, otherwise use the same one sequentially.

The most important exception to this is CI, but @testset Does the Right Thing™, so you don't need to worry about that either.
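
For the multithreaded half of that rule of thumb, here is a rough sketch that gives each unit of work its own generator. The function name sum_of_draws and the choice of one distinct integer seed per generator are assumptions made for illustration; the sketch does not claim anything about the statistical independence of the resulting streams.

```julia
using Random

# Hypothetical helper: each entry of `seeds` gets its own generator, so no
# generator state is shared between threads.
function sum_of_draws(seeds, n)
    results = zeros(length(seeds))
    Threads.@threads for k in eachindex(seeds)
        rng = MersenneTwister(seeds[k])   # created once per unit of work, not per draw
        s = 0.0
        for _ in 1:n
            s += rand(rng)
        end
        results[k] = s                    # each iteration writes to its own slot
    end
    return results
end

r1 = sum_of_draws([1, 2, 3, 4], 1_000)
r2 = sum_of_draws([1, 2, 3, 4], 1_000)
println(r1 == r2)   # true: results depend only on the seeds, not on thread scheduling
```

In sequential code none of this is needed: a single rng created before the loop is enough.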