The algorithm here: https://arxiv.org/abs/1805.10941
is faster and is claimed to be unbiased.
EDIT: (Skip this. Corrected and improved code is in my following post)
mytrunc(x) = UInt64((x << 64) >> 64)
function nearlydivisionless(s)
x = rand(UInt64)
m = convert(UInt128, x) * s
l = mytrunc(m)
if (l < s)
t = -s % s
while l < t
x = rand(UInt64)
m = convert(UInt128, x) * s
l = mytrunc(m)
end
end
return Int(m >> 64)
end
function biased(s)
return floor(Int, rand() * s)
end
Benchmarks:
julia> @btime nearlydivisionless(10);
7.042 ns (0 allocations: 0 bytes)
julia> @btime biased(10);
5.188 ns (0 allocations: 0 bytes)
julia> @btime rand(0:9);
12.698 ns (0 allocations: 0 bytes)