# rand(1:10) vs Int(round(10*rand()))

Why should I use rand(1:N) for an integer N, instead of Int(round(rand()*N))? On my computer, it seems that Int(round(rand()*N)) is about 3 times faster than rand(1:10).

1 Like
• It's more succinct
• It's more obvious in terms of what it's doing
• I trust Julia to make sure that rand(1:N) is bias-free. If your shortcut is indeed "safe," then I'd imagine the RNG folks would be happy to implement that as the implementation for rand(1:N), so that performance difference might disappear in the future.
3 Likes

To that point, it turns out that Int(round(rand() * N)) is definitely not bias-free:
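A minimal sketch of the bias (N = 4 is small enough to see it directly; the code below is illustrative, not from the thread):

```julia
# Empirical distribution of Int(round(rand() * N)) for N = 4.
# rand() is uniform on [0, 1), so rounding maps
#   [0, 0.5)       -> 0   (a half-width interval)
#   [k-0.5, k+0.5) -> k   for k = 1, 2, 3
#   [3.5, 4)       -> 4   (a half-width interval)
N, M = 4, 10^6
counts = zeros(Int, N + 1)   # counts[k+1] holds the tally for value k
for _ in 1:M
    counts[Int(round(rand() * N)) + 1] += 1
end
probs = counts ./ M
# probs ≈ [0.125, 0.25, 0.25, 0.25, 0.125]: the endpoint values 0 and N
# are half as likely as the interior values, and 0 and N aren't even in 1:N.
```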

7 Likes

Well, 1 + floor(Int, rand()*N) should be bias-free and indeed appears to be significantly faster than rand(1:N). But the fact that the OP seriously messed up the expression is a decent anecdotal argument for why people should usually use the simpler formulation anyway.
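A quick sanity check (a sketch; `draw` is a hypothetical helper name) that the floor formulation lands exactly in 1:N: since rand() ∈ [0, 1), floor(Int, rand()*N) ∈ 0:N-1, so adding 1 gives 1:N.

```julia
# Hypothetical helper, not part of any API.
draw(N) = 1 + floor(Int, rand() * N)

N = 5
samples = [draw(N) for _ in 1:10^5]
# Every sample is in 1:N, and with overwhelming probability every value appears.
extrema(samples)  # (1, 5)
```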

4 Likes

Beautiful demonstration of the underlying fair uniformly distributed random number generator AND rounding intervals.

Yeah, Int(ceil(N*rand())) seems unbiased and is much faster. My code is performance critical, so I need all the speed boosts I can get.

As long as N is not too big, the bias is very small. Biases become visible for big N, for example:

julia> using Statistics

julia> N=10^16; M=10^6; mean(iseven(Int(ceil(N*rand()))) for i in 1:M)
0.549885

julia> N=10^16; M=10^6; mean(iseven(rand(1:N)) for i in 1:M)
0.500379

[@bkamins has corrected benchmarks]

3 Likes

You might also have a look at this https://juliasnippets.blogspot.com/2017/11/basics-of-generating-random-numbers-in.html (it is Julia 0.6 based but the reasons for the bias have not changed).

Having said that, for very small N in very tight loops I use the approximate formula, as the bias is small and it is significantly faster (see the corrected benchmark below; also note that you have to use 1+floor, not ceil, as ceil can produce 0):

julia> f1(N=5,M=10^6) = mean(iseven(1+floor(Int, N*rand())) for i in 1:M)
f1 (generic function with 3 methods)

julia> f2(N=5,M=10^6) = mean(iseven(rand(1:N)) for i in 1:M)
f2 (generic function with 3 methods)

julia> @btime f1()
9.020 ms (0 allocations: 0 bytes)
0.400145

julia> @btime f2()
17.681 ms (0 allocations: 0 bytes)
0.399574
4 Likes

The algorithm here: https://arxiv.org/abs/1805.10941
is faster and is claimed to be unbiased.

EDIT: (Skip this. Corrected and improved code is in my following post)

mytrunc(x) = UInt64((x << 64) >> 64)

function nearlydivisionless(s)
    x = rand(UInt64)
    m = convert(UInt128, x) * s
    l = mytrunc(m)
    if l < s
        t = -s % s
        while l < t
            x = rand(UInt64)
            m = convert(UInt128, x) * s
            l = mytrunc(m)
        end
    end
    return Int(m >> 64)
end

function biased(s)
    return floor(Int, rand() * s)
end


Benchmarks:

julia> @btime nearlydivisionless(10);
7.042 ns (0 allocations: 0 bytes)

julia> @btime biased(10);
5.188 ns (0 allocations: 0 bytes)

julia> @btime rand(0:9);
12.698 ns (0 allocations: 0 bytes)
5 Likes

This is indeed very good. Additionally we can specialize the function for small s (fitting in UInt32) to gain even more speed in this case.

1 Like

Yes. In this case the new algorithm appears to be significantly faster than the biased algorithm:

@inline function drawl(s::Integer, maxval, T, T2)
    x = rand(T)
    m = convert(T2, x) * s
    l = m % maxval
    return (l, m)
end

@inline function _nearlydivisionless(s::Int32)
    nbits = 32
    __nearlydivisionless(s, UInt32, UInt64, UInt64(1) << nbits, nbits)
end

@inline function _nearlydivisionless(s::Int64)
    nbits = 64
    __nearlydivisionless(s, UInt64, UInt128, UInt128(1) << nbits, nbits)
end

function nearlydivisionless(s::Integer)
    if s < typemax(UInt32)
        s1 = Int32(s)
    else
        s1 = Int64(s)
    end
    _nearlydivisionless(s1)
end

@inline function __nearlydivisionless(s::Integer, T, T2, maxval, nbits)
    l, m = drawl(s, maxval, T, T2)
    if l < s
        t = maxval % s
        while l < t
            l, m = drawl(s, maxval, T, T2)
        end
    end
    return Int(m >> nbits)
end

biased(s) = floor(Int, rand() * s)

Here are the benchmarks:

The builtin function:

julia> @btime rand(0:99);
9.358 ns (0 allocations: 0 bytes)

The biased algorithm:

julia> @btime biased(100);
4.893 ns (0 allocations: 0 bytes)

The following function checks that 100 < typemax(UInt32) and chooses the routine accordingly. In addition, it's a bit faster if the argument is Int32(100) rather than 100.

julia> @btime nearlydivisionless(Int32(100));
5.209 ns (0 allocations: 0 bytes)

Bypassing the check mentioned above is much faster, and faster than the biased algorithm as well.

julia> @btime _nearlydivisionless(Int32(100));
3.698 ns (0 allocations: 0 bytes)

These functions appear to work correctly now. The implementation above could no doubt be improved significantly.

If you read the paper, youâ€™ll find that for n = 100, a single division is performed 2e-6 percent of the time. The other times, no division is performed.

I think choosing the routine based on whether n < typemax(UInt32) is not optimal. As n approaches the maximum value of the type, a single division is performed more often. This may kill the advantage of working with UInt64 rather than UInt128.
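The slow-path frequency is easy to estimate from the algorithm itself (a back-of-the-envelope sketch, not from any API): the division branch is entered with probability at most s / 2^w for word size w, which matches the roughly 2e-6 percent figure for s = 100 with 32-bit words.

```julia
# Probability bound for taking the slow path (the branch containing the division):
# the check `l < s` succeeds for at most s out of 2^w values of l.
s = 100
p32 = s / 2.0^32   # ≈ 2.3e-8, i.e. about 2e-6 percent, for 32-bit words
p64 = s / 2.0^64   # ≈ 5.4e-18, vanishingly rare for 64-bit words
# As s approaches typemax of the word type, these bounds grow toward 1,
# which is the concern about selecting the routine by comparing s to typemax(UInt32).
```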

5 Likes

What is the conclusion of the debate? Can it be stated in a few sentences?

There is general interest in exploring the method from https://arxiv.org/abs/1805.10941 for use in Julia. The following issue keeps track, though not so much has happened for now: https://github.com/JuliaLang/julia/issues/29004

3 Likes

Thank you, that's all I need to know.

10^16 might be stretching Float64 precision a bit: about 70% of the numbers returned are integers.

julia> M = 10^6;

julia> x = 0; for i in 1:M
           global x
           n = 10^16 * rand()
           x += (n % 1) == 0.0
       end; print(x)
700058

julia> 10^16 == Float64(10^16)
true

The problem is that rand(Float64) effectively generates random fixed-point numbers and then converts them to floats. See https://discourse.julialang.org/t/output-distribution-of-rand/12192.
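A quick way to see the fixed-point granularity (hedged: the exact grid spacing has varied across Julia versions — 2^-52 on the 0.6-era dSFMT generator versus 2^-53 on current Xoshiro — but both grids are multiples of 2^-53):

```julia
# Every value from rand(Float64) sits on a grid of multiples of 2^-53
# (older versions used 2^-52, which is still a multiple of 2^-53).
# Scaling by a power of two is exact in floating point, so the check is reliable.
x = rand()
(x * 2^53) % 1 == 0.0   # true: x is an exact multiple of 2^-53
# Hence multiplying by 10^16 > 2^53 frequently lands exactly on an integer.
```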

Here's another funny example:

julia> 10^16 + 1.0 == 10^16 - 1.0
true