Why should I use rand(1:N) for an integer N, instead of Int.(round.(rand()*N))? On my computer, it seems that Int.(round.(rand()*N)) is about 3 times faster than rand(1:10).
- It’s more succinct
- It’s more obvious in terms of what it’s doing
- I trust Julia to make sure that `rand(1:N)` is bias-free. If your shortcut is indeed “safe,” then I’d imagine the RNG folks would be happy to adopt it as the implementation of `rand(1:N)`, so that performance difference might disappear in the future.
`1 + floor(Int, rand()*N)` should be bias-free and indeed appears to be significantly faster than `rand(1:N)`. But the fact that the OP seriously messed up the expression is a decent anecdotal argument for why people should usually use the simpler formulation anyway.
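To see the difference concretely, here is a minimal sketch (mine, not from the thread's benchmarks) comparing the two formulas for the extreme case N = 2: the rounding formula is non-uniform and can even return 0, outside `1:N`, while the floor formula is uniform on `{1, 2}`.

```julia
# Compare the distributions of the round-based and floor-based draws for N = 2.
M = 10^6
N = 2
round_draws = [Int(round(rand() * N)) for _ in 1:M]      # can hit 0, 1, 2
floor_draws = [1 + floor(Int, rand() * N) for _ in 1:M]  # hits 1, 2 only

# round puts ~50% of the mass on 1 and ~25% on each endpoint:
count(==(0), round_draws) / M   # ≈ 0.25 (out of range!)
count(==(1), round_draws) / M   # ≈ 0.5
count(==(1), floor_draws) / M   # ≈ 0.5
```

The interior value 1 collects mass from both sides of its rounding interval, which is exactly the bias that `floor` avoids.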
Beautiful demonstration of the underlying fair uniformly distributed random number generator AND rounding intervals.
Yeah, `Int(ceil(N*rand()))` seems unbiased and is much faster. My code is performance-critical, so I need all the speed boosts I can get.
As long as `N` is not too big, the bias is very small. Biases become visible for big `N`, for example:

```julia
julia> N = 10^16; M = 10^6; mean(iseven(Int(ceil(N*rand()))) for i in 1:M)
0.549885

julia> N = 10^16; M = 10^6; mean(iseven(rand(1:N)) for i in 1:M)
0.500379
```
[@bkamins has corrected benchmarks]
You might also have a look at this https://juliasnippets.blogspot.com/2017/11/basics-of-generating-random-numbers-in.html (it is Julia 0.6 based but the reasons for the bias have not changed).
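The mechanism behind the bias can be shown with a toy model (mine, not from the blog post): a 3-bit uniform source mapped onto `1:3` with the `ceil` formula cannot split its 8 equally likely values into 3 equal buckets.

```julia
# Toy model: rand() replaced by k/8 for k = 0..7, then mapped to 1:3 via ceil.
N = 3
buckets = [Int(ceil(N * k / 8)) for k in 0:7]
# buckets == [0, 1, 1, 2, 2, 2, 3, 3] — note the out-of-range 0 and the
# overrepresented 2. The same effect, at Float64 scale, causes the bias above.
```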
Having said that, for very small `N` in very tight loops I use the approximate formula, as the bias is small and it is significantly faster (see the corrected benchmark below; note also that `ceil` can produce 0 when `rand()` returns exactly 0.0, so the `1 + floor` form is safer):

```julia
julia> f1(N=5, M=10^6) = mean(iseven(1 + floor(Int, N*rand())) for i in 1:M)
f1 (generic function with 3 methods)

julia> f2(N=5, M=10^6) = mean(iseven(rand(1:N)) for i in 1:M)
f2 (generic function with 3 methods)

julia> @btime f1()
  9.020 ms (0 allocations: 0 bytes)
0.400145

julia> @btime f2()
  17.681 ms (0 allocations: 0 bytes)
0.399574
```
The algorithm here: https://arxiv.org/abs/1805.10941
is faster and is claimed to be unbiased.
EDIT: (Skip this. Corrected and improved code is in my following post)
```julia
mytrunc(x) = UInt64((x << 64) >> 64)

function nearlydivisionless(s)
    x = rand(UInt64)
    m = convert(UInt128, x) * s
    l = mytrunc(m)
    if (l < s)
        t = -s % s
        while l < t
            x = rand(UInt64)
            m = convert(UInt128, x) * s
            l = mytrunc(m)
        end
    end
    return Int(m >> 64)
end

function biased(s)
    return floor(Int, rand() * s)
end
```
```julia
julia> @btime nearlydivisionless(10);
  7.042 ns (0 allocations: 0 bytes)

julia> @btime biased(10);
  5.188 ns (0 allocations: 0 bytes)

julia> @btime rand(0:9);
  12.698 ns (0 allocations: 0 bytes)
```
This is indeed very good. Additionally, we can specialize the function for small `s` (fitting in `UInt32`) to gain even more speed in this case.
Yes. In this case the new algorithm appears to be significantly faster than the biased algorithm:
```julia
@inline function drawl(s::Integer, maxval, T, T2)
    x = rand(T)
    m = convert(T2, x) * s
    l = m % maxval
    return (l, m)
end

@inline function _nearlydivisionless(s::Int32)
    nbits = 32
    __nearlydivisionless(s, UInt32, UInt64, UInt64(1) << nbits, nbits)
end

@inline function _nearlydivisionless(s::Int64)
    nbits = 64
    __nearlydivisionless(s, UInt64, UInt128, UInt128(1) << nbits, nbits)
end

function nearlydivisionless(s::Integer)
    if s < typemax(UInt32)
        s1 = Int32(s)
    else
        s1 = Int64(s)
    end
    _nearlydivisionless(s1)
end

@inline function __nearlydivisionless(s::Integer, T, T2, maxval, nbits)
    l, m = drawl(s, maxval, T, T2)
    if (l < s)
        t = maxval % s
        while l < t
            l, m = drawl(s, maxval, T, T2)
        end
    end
    return Int(m >> nbits)
end

biased(s) = floor(Int, rand() * s)
```
Here are the benchmarks:
The builtin function:
```julia
julia> @btime rand(0:99);
  9.358 ns (0 allocations: 0 bytes)
```
The biased algorithm:
```julia
julia> @btime biased(100);
  4.893 ns (0 allocations: 0 bytes)
```
The following function checks that `100 < typemax(UInt32)` and chooses the routine accordingly. In addition, it’s a bit faster if the argument is `Int32(100)` rather than `100`:

```julia
julia> @btime nearlydivisionless(Int32(100));
  5.209 ns (0 allocations: 0 bytes)
```
Bypassing the check mentioned above is much faster. Much faster than the biased algorithm as well.
```julia
julia> @btime _nearlydivisionless(Int32(100));
  3.698 ns (0 allocations: 0 bytes)
```
These functions appear to work correctly now. The implementation above could no doubt be improved significantly.
If you read the paper, you’ll find that for n = 100, a single division is performed 2e-6 percent of the time. The other times, no division is performed.
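That figure is easy to reproduce with a one-liner (a sketch with my own naming, not from the paper): the branch containing a division is entered only when `l < s`, which happens with probability `s / 2^nbits`, where `nbits` is the width of the random word.

```julia
# Probability that the slow branch (the one containing a division) is taken
# in the nearly-divisionless method, for a given random-word width nbits.
division_prob(s, nbits) = s / 2.0^nbits

p32 = division_prob(100, 32)   # ≈ 2.3e-8, i.e. about 2e-6 percent
p64 = division_prob(100, 64)   # vanishingly rare for the 64-bit variant
```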
I think choosing the routine based on whether `n < typemax(UInt32)` is not optimal. As `n` approaches the maximum value of the type, a single division is performed more often. This may kill the advantage of working with `UInt64` rather than `UInt128`.
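To put numbers on this (a sketch of mine for the 32-bit path): as `s` grows toward `typemax(UInt32)`, the division branch is entered on almost every draw, and the rejection rate can reach one half.

```julia
# For the 32-bit path: the division branch runs with probability s / 2^32,
# and the rejection loop repeats with probability (2^32 mod s) / 2^32.
div_prob(s) = s / 2.0^32
rej_prob(s) = ((UInt64(1) << 32) % UInt64(s)) / 2.0^32

div_prob(100)                  # ≈ 2.3e-8: essentially never divides
div_prob(typemax(UInt32) - 1)  # ≈ 1.0: a division on almost every draw
rej_prob(UInt32(2)^31 + 1)     # ≈ 0.5: worst case, half the draws are redrawn
```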
What is the conclusion of the debate? Can it be stated in a few sentences?
There is general interest in exploring the method from https://arxiv.org/abs/1805.10941 for use in Julia. The following issue keeps track of it, though not much has happened there so far: https://github.com/JuliaLang/julia/issues/29004
Thank you, that’s all I need to know.
`10^16` might be stretching Float64 precision levels a bit; about 70% of the returned numbers are exact integers:

```julia
julia> M = 10^6;

julia> x = 0; for i in 1:M; global x; n = 10^16; n = n*rand(); x = x + ((n % 1) == 0.0); end; print(x)
700058
```
```julia
julia> 10^16 == Float64(10^16)
true
```
The problem is that `rand(Float64)` effectively generates random fixed-point numbers and then converts them to floats. See https://discourse.julialang.org/t/output-distribution-of-rand/12192.
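This is easy to check empirically (a sketch; the exact granularity depends on the RNG and Julia version, but the standard generators emit multiples of a coarse power of two):

```julia
# Every rand(Float64) output here is an integer multiple of 2.0^-53 —
# far coarser than the ~10^16 distinct integers requested above.
xs = [rand() for _ in 1:10^5]
is_fixed_point = all(x -> x * 2.0^53 == round(x * 2.0^53), xs)
```

In particular, once `10^16 * rand()` is at least `2^52 ≈ 9.0e15` (about 55% of draws), neighboring Float64 values are at least 1 apart, so the product necessarily rounds to an integer; that accounts for the bulk of the ~70% figure above.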
Here’s another funny example:

```julia
julia> 10^16 + 1.0 == 10^16 - 1.0
true
```