Why should I use rand(1:N) for an integer N, instead of Int.(round.(rand()*N?)) On my computer, it seems that Int.(round.(rand()*N?)) is about 3 times faster than rand(1:10).

- Itâ€™s more succinct
- Itâ€™s more obvious in terms of what itâ€™s doing
- I trust Julia to make sure that
`rand(1:N)`

is bias-free. If your shortcut is indeed â€śsafe,â€ť then Iâ€™d imagine the RNG folks would be happy to implement that as the implementation for`rand(1:N)`

, so that performance difference might disappear in the future.

Well, `1 + floor(Int, rand()*N))`

should be bias free and indeed appears to be significantly faster than `rand(1:N)`

. But the fact that OP seriously messed up the expression is a decent anecdotal argument for why people should usually use the simpler formulation anyway.

Beautiful demonstration of the underlying fair uniformly distributed random number generator AND rounding intervals.

Yeah, Int(ceil(N*rand())) is seems unbias and is much faster. My code is performance critical, so I need all the speed boosts I can get.

As long as `N`

is not too big then the bias is very small. Biases become visible for big `N`

, for example:

```
julia> N=10^16; M=10^6; mean(iseven(Int(ceil(N*rand()))) for i in 1:M)
0.549885
julia> N=10^16; M=10^6; mean(iseven(rand(1:N)) for i in 1:M)
0.500379
```

[@bkamins has corrected benchmarks]

You might also have a look at this https://juliasnippets.blogspot.com/2017/11/basics-of-generating-random-numbers-in.html (it is Julia 0.6 based but the reasons for the bias have not changed).

Having said that for very small `N`

in very tight loops I use the approximate formula as the bias is small and it is significantly faster (see the corrected benchmark below - also you have to use `1+floor`

not `ceil`

as `ceil`

can produce `0`

):

```
julia> f1(N=5,M=10^6) = mean(iseven(1+floor(Int, N*rand())) for i in 1:M)
f1 (generic function with 3 methods)
julia> f2(N=5,M=10^6) = mean(iseven(rand(1:N)) for i in 1:M)
f2 (generic function with 3 methods)
julia> @btime f1()
9.020 ms (0 allocations: 0 bytes)
0.400145
julia> @btime f2()
17.681 ms (0 allocations: 0 bytes)
0.399574
```

The algorithm here: https://arxiv.org/abs/1805.10941

is faster and is claimed to be unbiased.

EDIT: (Skip this. Corrected and improved code is in my following post)

```
mytrunc(x) = UInt64((x << 64) >> 64)
function nearlydivisionless(s)
x = rand(UInt64)
m = convert(UInt128, x) * s
l = mytrunc(m)
if (l < s)
t = -s % s
while l < t
x = rand(UInt64)
m = convert(UInt128, x) * s
l = mytrunc(m)
end
end
return Int(m >> 64)
end
function biased(s)
return floor(Int, rand() * s)
end
```

Benchmarks:

```
julia> @btime nearlydivisionless(10);
7.042 ns (0 allocations: 0 bytes)
julia> @btime biased(10);
5.188 ns (0 allocations: 0 bytes)
julia> @btime rand(0:9);
12.698 ns (0 allocations: 0 bytes)
```

This is indeed very good. Additionally we can specialize the function for small `s`

(fitting in `UInt32`

) to gain even more speed in this case.

Yes. In this case the new algorithm appears to be significantly faster than the biased algorithm:

```
@inline function drawl(s::Integer, maxval, T, T2)
x = rand(T)
m = convert(T2, x) * s
l = m % maxval
return (l, m)
end
@inline function _nearlydivisionless(s::Int32)
nbits = 32
__nearlydivisionless(s, UInt32, UInt64, UInt64(1) << nbits, nbits)
end
@inline function _nearlydivisionless(s::Int64)
nbits = 64
__nearlydivisionless(s, UInt64, UInt128, UInt128(1) << nbits, nbits)
end
function nearlydivisionless(s::Integer)
if s < typemax(UInt32)
s1 = Int32(s)
else
s1 = Int64(s)
end
_nearlydivisionless(s1)
end
@inline function __nearlydivisionless(s::Integer, T, T2, maxval, nbits)
l, m = drawl(s, maxval, T, T2)
if (l < s)
t = maxval % s
while l < t
l, m = drawl(s, maxval, T, T2)
end
end
return Int(m >> nbits)
end
biased(s) = floor(Int, rand() * s)
```

Here are the benchmarks:

The builtin function:

```
julia> @btime rand(0:99);
9.358 ns (0 allocations: 0 bytes)
```

The biased algorithm:

```
julia> @btime biased(100);
4.893 ns (0 allocations: 0 bytes)
```

The following function checks that `100 < typemax(UInt32)`

and chooses the routine accordingly. In addtion, itâ€™s a bit faster if the argument is `Int32(100)`

rather than `100`

.

```
julia> @btime nearlydivisionless(Int32(100));
5.209 ns (0 allocations: 0 bytes)
```

Bypassing the check mentioned above is much faster. Much faster than the biased algorithm as well.

```
julia> @btime _nearlydivisionless(Int32(100));
3.698 ns (0 allocations: 0 bytes)
```

These functions appear to work correctly now. The implementation above could no doubt be improved significantly.

If you read the paper, youâ€™ll find that for n = 100, a single division is performed `2e-6`

percent of the time. The other times, no division is performed.

I think choosing the routine based on whether `n < typemax(UInt32)`

is not optimal. As `n`

approaches the maximum value of the type, a single division is performed more often. This may kill the advantage of working with `UInt64`

rather than `UInt128`

.

What is conclusion of the debate? Can be stated in few sentance?

There is general interest in exploring the method from https://arxiv.org/abs/1805.10941 for use in Julia. The following issue keeps track, though not so much has happened for now: https://github.com/JuliaLang/julia/issues/29004

Thank you, that all I need to know.

`*10^16`

might be stretching the float 64 precision levels a bit, 70% integer numbers returned

```
julia> M= 10^6;
julia> x=0; for i in 1:M; global x;
n = 10^16;
n=n*rand();
x=x+ ((n%1)==0.0);
end; print(x);
700058
```

```
julia> 10^16 == Float64(10^16)
true
```

The problem is that `rand(Float64)`

effectively generates random fixed-point numbers and then converts them to floats. See https://discourse.julialang.org/t/output-distribution-of-rand/12192.

Hereâ€™s another funny example

```
julia> 10^16 +1.0 == 10^16 -1.0
true
```