Is it really possible for rand to throw a zero?

aerdely · September 7, 2019, 4:29pm

rand documentation says it draws iid uniformly distributed values in the half-open unit interval [0,1) which includes zero but… Is rand ever throwing a zero? I have tried several times with minimum(rand(1_000_000_000)) and it is always strictly greater than zero.

anon92994695 · September 7, 2019, 4:40pm

Think about how rare this would be from a probabalistic perspective. All decimal combinations are possible down to machine precision? Without looking at the code I would say this is an extremely rare thing to look for!

aerdely · September 7, 2019, 4:43pm

That is exactly why I think it is pointless to say that rand draws iid numbers in the half-open unit interval [0,1) including zero, I was expecting just the open interval (0,1) as usual, like in some other programming languages.

Oscar_Smith · September 7, 2019, 4:45pm

Specifically, there are 2^53 floating point numbers in [0,1), so you would expect to draw ~10^15 before you expect a 0. That said, since 0 is a possible output, it matters a lot, since you should for example not write 1/rand(), since that could occasionally get a divide by zero error.

aerdely · September 7, 2019, 4:49pm

Being that the case, that is what surprises me: why including zero? And so, why not including 1?

Tamas_Papp · September 7, 2019, 4:53pm

See this discussion if you want the 1 instead:

As for the reason: various conventions exist, this is a common one. It is somewhat convenient: the actual value is generated in [1, 2), then adjusted. You may be interested in the source, eg

github.com

JuliaLang/julia/blob/a3ccac00a064cb20720927668eee8d3a44124364/stdlib/Random/src/generation.jl#L35


      
          rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01{Float16}}) =
              Float16(reinterpret(Float32,
                                  (rand(r, UInt10(UInt32)) << 13)  | 0x3f800000) - 1)
          
          rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01{Float32}}) =
              reinterpret(Float32, rand(r, UInt23()) | 0x3f800000) - 1
          
          rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen12_64}) =
              reinterpret(Float64, 0x3ff0000000000000 | rand(r, UInt52()))
          
          rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01_64}) = rand(r, CloseOpen12()) - 1.0
          
          #### BigFloat
          
          const bits_in_Limb = sizeof(Limb) << 3
          const Limb_high_bit = one(Limb) << (bits_in_Limb-1)
          
          struct SamplerBigFloat{I<:FloatInterval{BigFloat}} <: Sampler{BigFloat}
              prec::Int
              nlimbs::Int
              limbs::Vector{Limb}

Christopher_Fisher · September 7, 2019, 4:55pm

If my math is correct, and I didn’t introduce float point imprecision, I think the probability of drawing at least one zero even in 100 million draws is fairly small.

N = 1_000_000_000   #Number of random draws
logp = log(1-eps()) #Log probability not drawing zero on a given trial
t = logp*N #Log probability of not drawing zero in N trials
p = 1-exp(t) #The probability of drawing at least 1 zero in N trials
julia> p
2.2204458027808016e-7

Someone else who is better at this might chime in.

aerdely · September 7, 2019, 4:57pm

Well, this seems like a disadvantage for probabilistic simulation since it needs iid uniform values on the open unit interval (0,1), that is excluding both 0 and 1. What would be the fastest way to eliminate the possibility of a zero with rand?

Tamas_Papp · September 7, 2019, 5:04pm

Don’t know about fastest, but this is a quick workaround:

function rand_interior(rng, T)
    while true
        z = rand(rng, T)
        iszero(z) || return z
    end
end

Note that this of course depends on the kind of simulation you are doing: some algorithms are totally fine with 0, some are not, some would want (0, 1] instead for theoretical correctness (may not matter in practice). It is impossible to design for all use cases, but as long as the specs are clear you should be able to obtain the desired kind of output.

robsmith11 · September 7, 2019, 5:09pm

Keep in mind that if you’re using Float64s, then the probability of rand() producing an exact zero is likely far less than other important bits getting flilpped in non-ECC memory…

StevenSiew · September 7, 2019, 11:02pm

I can assure you that zero is possible

for a decimal floating point system (as oppose to a binary floating point system)

    function PDFP_randomPDFPbetweenZeroandOne()
        DEFAULT_PRECISION[] > 0 ? newmantissa = rand(0:9,DEFAULT_PRECISION[]) : newmantissa = []
        # if the first digit is zero
        if DEFAULT_PRECISION[] > 0 && newmantissa[1]==0
            if sum(  map(Int64,newmantissa)  ) == 0
                # if all the digits are zero then
                # return zero
                return PDFP(0,0,newmantissa)
            else
                # Otherwise normalise the number
                return PDFP_normalise( PDFP(0,-1,newmantissa) )
            end
        end
        # Othersie return the number with the exponent set to -1
        PDFP(0,-1,newmantissa)
    end

Here is an example for a one digit mantissa decimal floating point system

julia> using PDFPs

julia> PDFP_setDefaultPrecision(1)
1

julia> a = [ PDFP_randomPDFPbetweenZeroandOne() for x in 1:20 ]
20-element Array{PDFP,1}:
 PDFP(0, -1, [3])
 PDFP(0, -1, [8])
 PDFP(0, -1, [2])
 PDFP(0, -1, [5])
 PDFP(0, -1, [3])
 PDFP(0, -1, [8])
 PDFP(0, -1, [5])
 PDFP(0, -1, [7])
 PDFP(0, -1, [9])
 PDFP(0, -1, [8])
  PDFP(0, 0, [0])
 PDFP(0, -1, [3])
 PDFP(0, -1, [8])
 PDFP(0, -1, [2])
 PDFP(0, -1, [5])
 PDFP(0, -1, [1])
 PDFP(0, -1, [6])
 PDFP(0, -1, [6])
 PDFP(0, -1, [9])
 PDFP(0, -1, [6])

julia> map(PDFP_toCommonString,a)
20-element Array{String,1}:
 "0.3"
 "0.8"
 "0.2"
 "0.5"
 "0.3"
 "0.8"
 "0.5"
 "0.7"
 "0.9"
 "0.8"
 "0"  
 "0.3"
 "0.8"
 "0.2"
 "0.5"
 "0.1"
 "0.6"
 "0.6"
 "0.9"
 "0.6"

Once you get to multiple digits mantissa, it is less likely to happen

julia> using PDFPs

julia> PDFP_setDefaultPrecision(4)
4

julia> a = [ PDFP_randomPDFPbetweenZeroandOne() for x in 1:20 ]
20-element Array{PDFP,1}:
 PDFP(0, -1, [2, 3, 3, 9])
 PDFP(0, -1, [6, 0, 9, 4])
 PDFP(0, -1, [7, 6, 3, 6])
 PDFP(0, -1, [9, 0, 7, 5])
 PDFP(0, -1, [1, 5, 9, 4])
 PDFP(0, -1, [2, 7, 2, 7])
 PDFP(0, -1, [7, 2, 0, 7])
 PDFP(0, -1, [5, 5, 9, 8])
 PDFP(0, -1, [2, 5, 3, 1])
 PDFP(0, -1, [1, 9, 0, 6])
 PDFP(0, -1, [1, 7, 6, 0])
 PDFP(0, -1, [5, 8, 2, 7])
 PDFP(0, -1, [1, 6, 6, 8])
 PDFP(0, -1, [2, 3, 1, 3])
 PDFP(0, -2, [9, 3, 7, 0])
 PDFP(0, -1, [6, 7, 6, 4])
 PDFP(0, -1, [3, 9, 6, 5])
 PDFP(0, -1, [6, 6, 3, 5])
 PDFP(0, -1, [3, 0, 2, 6])
 PDFP(0, -1, [8, 5, 7, 8])

julia> map(PDFP_toCommonString,a)
20-element Array{String,1}:
 "0.2339" 
 "0.6094" 
 "0.7636" 
 "0.9075" 
 "0.1594" 
 "0.2727" 
 "0.7207" 
 "0.5598" 
 "0.2531" 
 "0.1906" 
 "0.1760" 
 "0.5827" 
 "0.1668" 
 "0.2313" 
 "0.09370"
 "0.6764" 
 "0.3965" 
 "0.6635" 
 "0.3026" 
 "0.8578"

Float64 binary floating point system has an equivalent of 15.8 decimal mantissa digits. You have more chances of winning powerball with me than you have of coming up with a zero.

StefanKarpinski · September 8, 2019, 6:27am

Sampling from the range [0,1) is pretty standard. What languages do something else?

StefanKarpinski · September 8, 2019, 6:29am

This is true but these powerball tickets are very cheap. If you’re writing robust code you do need to account for the possibility of getting zero.

JeffreySarnoff · September 8, 2019, 5:55pm

if you prefer no zeros and no ones
(and, for simplicity, only want Float64s)

function openrand()
     result = rand()
     while iszero(result)
         result = rand()
     end
     return result
end

aerdely · September 8, 2019, 8:16pm

In R runif(n, 0, 1) will not generate either of the extreme values 0 or 1 (according to R’s documentation), and from a probabilistic point of view, it is better this way for use with the quantile function in probabilistic simulation.

Elrod · September 8, 2019, 8:55pm

Most RNGs return numbers between*

julia> @inline mask(x, ::Type{Float64}) = reinterpret(Float64,(x & 0x000fffffffffffff) | 0x3ff0000000000000)
mask (generic function with 1 method)

julia> mask(typemin(UInt64),Float64) - 1.0
0.0

julia> mask(typemax(UInt64),Float64) - 1.0
0.9999999999999998

Instead of calculating u - 1, where u is uniformly distributed in [1.0,2.0), we could calculate

julia> muladd(mask(typemin(UInt64),Float64),nextfloat(-1.0),prevfloat(2.0))
0.9999999999999999

julia> muladd(mask(typemax(UInt64),Float64),nextfloat(-1.0),prevfloat(2.0))
2.2204460492503128e-16

?

Other than dropping zero, we’re not really losing anything from the bottom end of the distribution.

julia> mask(one(UInt64),Float64) - 1.0
2.220446049250313e-16

while the maximum value returned is now prevfloat(1.0) instead of prevfloat(1.0,2).

*For clarification, this is because those RNGs generate sequences of bits, ie, rand(UInt64). So the following will work (after using Random and defining the mask function):

myrng(rng = Random.GLOBAL_RNG) = muladd(mask(rand(rng,UInt64),Float64),nextfloat(-1.0),prevfloat(2.0))

From this PR:
https://github.com/JuliaLang/julia/pull/33251
All we actually need to do is use prevfloat(one(T)) instead of one(T).

julia> @inline mask(x, ::Type{Float64}) = reinterpret(Float64,(x & 0x000fffffffffffff) | 0x3ff0000000000000)
mask (generic function with 1 method)

julia> mask(typemin(UInt64), Float64) - prevfloat(one(Float64))
1.1102230246251565e-16

julia> mask(typemax(UInt64), Float64) - prevfloat(one(Float64))
0.9999999999999999

Now we would have samples from (0,1).

StefanKarpinski · September 8, 2019, 9:15pm

To answer my own question, Julia, C, C++, Fortran, Python, Perl and Ruby sample from [0,1):

Matlab and R sample from (0,1):

Mathematica samples from [0,1]:

Mathematical Functions—Wolfram Language Documentation

In any case, Julia is not in bad company here, and arguably sampling from [0,1) is the closest to a standard behavior in this area.

bkamins · September 8, 2019, 9:43pm

Just to add to the list by @StefanKarpinski, Excel (and also other standard spreadsheets like Google Sheets) is also in Julia’s camp.

So if we judged by the number of people in the world that routinely use one of these definitions I would bet that [0,1) beats alternatives as a “standard” by a huge margin.

(of course I do not try to impute here that popularity should be the major factor in deciding what is best from the scientific perspective)

Tamas_Papp · September 9, 2019, 5:11am

Also, from a purely practical point of view, different applications just have different needs for the endpoints, and it is much easier to discard the 0 from [0,1) than to add it to (0,1) preserving uniformity.

Per · September 9, 2019, 7:59am

By my estimate, a CPU that does nothing but generate random numbers will come up with an exact zero every 1000 hours or so. (Obviously depends on the number of cores and the complexity of the algorithm that generates the underlying pseudorandom Int64s.)

So not that rare if you’re doing Monte Carlo simulations.

Edit: Just Googled the frequency of memory errors, and it seems to be the same order of magnitude (once per 1000 hours). Far more common than I would have guessed…

Topic		Replies	Views
Output distribution of rand(Float32) and rand(Float64), thread 2 Internals & Design float , random	437	6994	November 19, 2023
How to create a random Uniform Distribution between (but excluding) 0 and 10? New to Julia	20	27517	March 17, 2019
Random number in (0,1] General Usage	16	14991	October 8, 2018
Frédéric Goualard found an inexactness in our rand() Numerics random	19	1037	September 8, 2022
Output distribution of rand() Internals & Design	7	2234	October 19, 2023

Is it really possible for rand to throw a zero?

Related topics