Randomly generate a sequence of numbers one by one using the rand function in the Distributions module

#1

Hi, the rand function in the Distributions module is very useful for generating a sequence / vector / list of random numbers from a given random number generator and a statistical distribution. What I need is to generate a long sequence of random numbers (say, several million) one at a time and process each one as it is generated. I can simply call rand and save the whole sequence into a vector, but that does not seem efficient. Is there another way to handle this? Many thanks.

#2

you could use a generator

n = 1_000_000
rg = (rand() for _ in 1:n)
for r in rg
    #do stuff with random number `r`
end
#3

Or just a loop?

for i = 1:n 
    r = rand()
    do_something(r)
end

There’s nothing wrong with loops in Julia, unlike other languages where you have to avoid them in critical code.

#4

Thank you for your kind reply. The situation I’m facing is this: the rand() function in the Distributions module accepts an optional first argument that specifies the random number generator. If I call rand with a generator supplied this way, say MersenneTwister(4), it gives me the “same” number each time, which is not what I need. Moreover, I need more than one rand call like this inside the same loop. Do you know how to accomplish this? For now I generate the random numbers I need up front, save them in a vector, and then process them one by one inside the for loop. I wonder if there is a better way.

#5

I’m not sure I understand, but that will only happen if you put the MersenneTwister(4) call inside the loop. Just assign the rng to a variable before the loop:

using Random # provides MersenneTwister
rng = MersenneTwister(4) # or rng = MersenneTwister() for a random seed
for i = 1:n 
    r = rand(rng) # different value at each iteration
    do_something(r)
end
#6

Thank you for your reply. The thing is, I need more than one rand call inside the for loop, and to make it more challenging, each rand call needs to use a different random number generator…

#7

I don’t really understand where your problem is… What about creating a second rng outside of the loop and making the rand calls as needed within the loop?

#8
julia> using Random

julia> rng = MersenneTwister(4)

julia> using Distributions

julia> for i = 1:10
           r = rand(rng,Categorical([0.4,0.6]))
           println(r)
       end
2
2
2
2
2
2
1
2
2
2

This is not what I expected.

#9

What do you expect?

#10

Oh, you’re right. I thought the output above was all 2’s and did not notice the 1 hidden in there.

#11

I’d guess that you might be better off creating that Categorical distribution object outside of the loop.

#12

why?

#13

Because it’s the same every time and it’s pretty hard for the compiler to prove that it doesn’t need to create a new array and object referring to that array on every iteration.
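To make the suggestion concrete, here is a minimal sketch of hoisting the distribution out of the loop. The function name count_twos is hypothetical and only there for illustration; the seed and probabilities are the ones used earlier in the thread.

```julia
using Random
using Distributions

rng = MersenneTwister(4)         # generator created once
d = Categorical([0.4, 0.6])      # distribution created once, outside the loop

# count_twos is a hypothetical helper: it counts how often the draw equals 2.
function count_twos(rng, d, n)
    c = 0
    for _ in 1:n
        c += rand(rng, d) == 2   # reuses d; no new array or object per iteration
    end
    return c
end

count_twos(rng, d, 10)
```

Whether this makes a measurable difference depends on the loop body, so it is worth benchmarking before relying on it.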

#14
julia> using Random
julia> using Distributions
julia> rng1 = MersenneTwister(1)
julia> rng2 = MersenneTwister(2)
julia> for i = 1:10
           r = rand(rng1, Categorical([0.4,0.6]))
           println("r: ", r)
           t = rand(rng2, Exponential(1/200))
           println("t: ", t)
       end
r: 1
t: 0.0030177027481547076
r: 1
t: 0.009348155469112253
r: 1
t: 0.004486075863176537
r: 1
t: 0.007756887782850566
r: 2
t: 0.007259221922800534
r: 1
t: 0.0038109894089538372
r: 2
t: 0.0015894111269976982
r: 2
t: 0.004725440286443651
r: 1
t: 0.006614692669167275
r: 2
t: 0.0026035963682239554

julia> for i = 1:10
           r = rand(MersenneTwister(1), Categorical([0.4,0.6]))
           t = rand(MersenneTwister(4), Exponential(1/200))
           println("r: ", r)
           println("t: ", t)
       end
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384
r: 1
t: 0.011176055504085384

The comparison above clears up my confusion. I know how it works now. I guess calling the MersenneTwister() function is like resetting the global random number generator. Or can anyone explain it more clearly?

#15

Each call to MersenneTwister(4) constructs a brand-new generator that starts from the same seeded state, so the effect “is like” re-seeding the global RNG on every iteration: doing that in the loop will invariably produce the same output. Moreover, it’s very wasteful to create so many new RNGs. By the way, do you really need to create two RNGs for the two distributions? In your example above, at least, it’s unnecessary.
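A small sketch of the difference, using only the Random standard library (the variable names are just for illustration):

```julia
using Random

# A fresh generator with the same seed always starts from the same state,
# so its first draw is always the same number.
a = rand(MersenneTwister(4))
b = rand(MersenneTwister(4))
same_first_draw = (a == b)

# A single generator that is reused keeps advancing its internal state,
# so successive draws differ.
rng = MersenneTwister(4)
x = rand(rng)                    # equals a: same seed, same first draw
y = rand(rng)                    # the next number in the stream
```

Here same_first_draw is true, which is exactly what happens when the constructor sits inside the loop.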

#16

To make sure the two streams are strictly unrelated…

#18

Are you sure that makes them more independent? I’m no expert on this, but it sounds to me like the type of ‘clever’ overcomplication that could end up making them less independent. You should definitely double check your assumption.

Also, why are you redefining the distributions (to the same values!) on each loop iteration?

#19

As a rule of thumb, create independent RNGs for multithreaded code, otherwise use the same one sequentially.

The most important exception to this is CI, but @testset Does the Right Thing™ so you don’t need to worry about that either.
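For the multithreaded half of that rule of thumb, a rough sketch could look like the following. The function name parallel_sums and the seeds 1:4 are hypothetical, and note the assumption: giving each task a different seed keeps the tasks from sharing generator state, but it does not by itself prove that the streams are statistically independent.

```julia
using Random

# parallel_sums is a hypothetical helper: one task per seed, and each task
# owns its own generator, so no RNG state is shared between tasks.
function parallel_sums(seeds, n)
    tasks = map(seeds) do seed
        Threads.@spawn begin
            rng = MersenneTwister(seed)   # created inside the task, used only there
            s = 0.0
            for _ in 1:n
                s += rand(rng)
            end
            s
        end
    end
    return fetch.(tasks)                  # wait for all tasks and collect the sums
end

sums = parallel_sums(1:4, 1_000)
```

Because every task draws from its own seeded generator, the results are reproducible no matter how the tasks get scheduled across threads.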