# Same random seed, but different random numbers?

Dear all,

I try to generate random number using both `randn()` and `rand()`, I set the same random seed, but why they generate different results?

``````julia> using Distributions, Random

julia> Random.seed!(123)

julia> a = rand(Normal(0, 1), 100);

julia> Random.seed!(123)

julia> b = randn(100);

julia> a == b
false

``````
1 Like

It works for vectors up to 7 elements long for some reason.

``````julia> begin
N = 7
Random.seed!(123)
a = rand(Normal(0, 1), N)
Random.seed!(123)
b = randn(N)
a == b
end
true
``````

Yes. Itâ€™s strange. When N>7, the results are different.

It comes from 7 being hardcoded as a specialization here

``````function \$randfun!(rng::Union{Xoshiro, TaskLocalRNG}, A::Array{Float64})
if length(A) < 7
for i in eachindex(A)
@inbounds A[i] = \$randfun(rng, Float64)
end
else
GC.@preserve A rand!(rng, UnsafeView{UInt64}(pointer(A), length(A)))

for i in eachindex(A)
@inbounds A[i] = \$_randfun(rng, reinterpret(UInt64, A[i]) >>> 12)
end
end
A
end
``````

though Iâ€™m not sure why `randn(N)` calls this while `rand(Normal(0, 1), N)` uses this

``````function _rand!(rng::AbstractRNG, sampler::Sampleable{Univariate}, A::AbstractArray{<:Real})
for i in eachindex(A)
@inbounds A[i] = rand(rng, sampler)
end
return A
end
``````

But here we see that for `N<7` the implementations seems similar with one element being generated at a time, and if you actually generate a 100 numbers one at a time with the two strategies they are the same, while for `N>7` `randn` will generate stuff in blocks with some other strategy that results in different numbers.

2 Likes

I hope that in Julia 1.8 release version, this problem can be solved.

I am not sure this is a â€śproblemâ€ť.
`rand()` and `rand(Normal(0, 1)` are not guaranteed (indeedâ€¦) to use the same algorithm, so why should we assume they provide the same random streams ? It would be a problem if any of the two provides a different random stream with the same seedâ€¦

4 Likes

But I think `randn()` and `rand(Normal(0, 1))` should get the same random numbers with the same seed.
in Julia, both `randn()` and `rand(Normal(0, 1))` generate normally-distributed random number of with mean 0 and standard deviation 1.

When we generate one random number with the same seed, we find that they always get the same random number.

``````julia> Random.seed!(123)

julia> randn()
-0.6457306721039767

julia> Random.seed!(123)

julia> rand(Normal(0, 1))
-0.6457306721039767
``````

So I think they should also get the same random numbers when N > 7.

I donâ€™t feel the need to have the assumption that they do generate them in the same way (hence with the same random stream). After all, they are different functions.
It happens that for N<=7 they do produce the same numbers/consume the same random stream from the RNG, but nothing guarantees that this is a general rule (indeed it isnâ€™t).
We shouldnâ€™t ever induce a general rule from a few observations, like in the Bertrand Russellâ€™s methaphor.

8 Likes

Think about it this way. `randn(N)` and `rand(Normal(0, 1), N)` are 2 different ways of sampling a standard normal variable; if you take enough samples (large N) from either, you see a normal distribution. They donâ€™t promise to or need to do it in the same order or the same way to accomplish that.

Setting the random seed is intended for reproducing the pseudorandom sampling. If youâ€™re trying to reproduce any behavior, you should run the exact same code. Never substitute any step, even if youâ€™re sure they are equivalent.

6 Likes

By the same logic, you should be even more discontent with

``````julia> using Random

julia> Random.seed!(123)

julia> a = randn(7);

julia> Random.seed!(123)

julia> b = randn(8)[1:7];

julia> a == b
false
``````

But as others have said, you cannot reasonably expect the same random stream if you are not running the exact same code. The fact that a specialization exists for `N>7` (or `N>13` for Mersenne Twisters) is a good sign. It means someone has given serious thought to maximizing performance, which is next to correctness the most important goal.

If any implementation is â€śat faultâ€ť, itâ€™s `Distributions`â€™, because their sampler circumvents the specialization in `Random`.

In any case, I cannot imagine a scenario in which requiring `randn == rand(Normal)` is sensible.

5 Likes