Output distribution of rand(Float32) and rand(Float64), thread 2

I’m only mildly worried about that case – you’re going to get Inf within milliseconds of rand(Float32) invocations.

I am much more worried about:

  1. The grad student who uses your randno0(Float32), applies a singular inverse cdf or something like it, gets silently, subtly wrong tail statistics, and unwittingly pollutes the scientific literature (edit: or loses a year of their PhD chasing wild geese, your pick which is worse).
  2. The grad student who uses rand(Float64), dutifully tests this on their laptop over the weekend, dutifully applies for and obtains a rare spot on their local cluster, and not-so-promptly blows up with an Inf. 2^52 might look like a large number, but that’s a mere teraflop-hour. I feel old now; what a preposterous combination of words, “mere teraflop-hour” (of course most time will be spent doing useful stuff, not generating random numbers; if you assume that RNG takes 0.1% of the compute time, then you blow up in 1k teraflop-hours, which I think is a less preposterous amount of compute than we old-timers would like to believe…)
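To spell out that back-of-the-envelope arithmetic (my numbers, not from the original post):

# Rough time to draw 2^52 floats at a given RNG throughput.
# 1e12 draws/s is a loose stand-in for "a teraflop devoted entirely to RNG".
draws   = 2.0^52            # ≈ 4.5e15
seconds = draws / 1e12      # ≈ 4500 s
hours   = seconds / 3600    # ≈ 1.25 h, i.e. roughly one "teraflop-hour"
# If RNG is only 0.1% of the work, multiply by 1000: ≈ 1250 teraflop-hours.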
1 Like

We could also simplify this by choosing to let the “bits gradually run out” — instead of using loops, just hard code maybe three or four highly unlikely nested branches of the thing. It’ll be easier to prove correct that way — and we don’t need to worry about counting the bits when we get close to rounding to zero (which will never happen anyway!).
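A minimal sketch of what such hard-coded branches could look like (the name, the 24-bit steps, and the three-level cutoff are my own choices for illustration, not a concrete proposal from this thread):

# Each extra branch only runs when every previously drawn chunk was zero, so the
# deep branches are astronomically rare and there is no open-ended loop to reason about.
function rand_tail_sketch()
    a = rand(UInt32) >>> 8                            # 24 bits: exact as Float32
    a != 0 && return Float32(a) * Float32(0x1.0p-24)  # usual case
    b = rand(UInt32) >>> 8
    b != 0 && return Float32(b) * Float32(0x1.0p-48)  # reached ~2^-24 of the time
    c = rand(UInt32) >>> 8
    c != 0 && return Float32(c) * Float32(0x1.0p-72)  # reached ~2^-48 of the time
    return 0.0f0                                      # probability ≈ 2^-72
end

Each level refines the interval left over by the previous one by another 24 bits, so the bits do “gradually run out” without any loop to count through.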

Here’s a table of interesting values of N (using my usual assumption of worst-case bit usage with N bits), showing at what point values of p are densely sampled — or, in other words, the regions of [0, 1) which have a perfectly accurate \text{cdf} for all representable values.

N     P(= 0)     Float16                  Float32              Float64
11    0.00049    p ≥ 0.5                  -                    -
24    5.96e-08   p ≥ Float16(6.104e-5)    p ≥ 0.5              -
32    2.33e-10   p ≥ 0                    p ≥ 0.001953125f0    -
53    1.11e-16                            p ≥ 9.313226f-10     p ≥ 0.5
64    5.42e-20                            p ≥ 4.5474735f-13    p ≥ 0.000244140625
126   1.18e-38                            p ≥ 9.8607613f-32    p ≥ 5.293955920339377e-23
128   2.94e-39                            p ≥ 2.4651903f-32    p ≥ 1.3234889800848443e-23
149   1.4e-45                             p ≥ 0                p ≥ 6.310887241768095e-30
196   9.96e-60                                                 p ≥ 4.484155085839415e-44
256   8.64e-78                                                 p ≥ 8.636168555094445e-78
1022  2.23e-308                                                p ≥ 1.0020841800044864e-292
1074  5.0e-324                                                 p ≥ 0
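As a rough sanity check of the P(= 0) column and the normal-range thresholds (my own approximation of the worst-case rule above; it only covers normal values, not the subnormal range):

# With N effective uniform bits, zero comes up with probability 2^-N.
p_zero(N) = 2.0^-N
p_zero(24)    # ≈ 5.96e-8
p_zero(53)    # ≈ 1.11e-16

# A normal value p of type T has neighbours eps(p) apart, so p is densely sampled
# (every representable value in its neighbourhood reachable) once eps(p) ≥ 2^-N.
densely_sampled(p, N) = eps(p) >= 2.0^-N
densely_sampled(0.5f0, 24)                       # true   (Float32 column, N = 24)
densely_sampled(0.001953125f0, 32)               # true   (Float32 column, N = 32)
densely_sampled(prevfloat(0.001953125f0), 32)    # false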

We can’t know how folks will (ab)use rand() but we can get rand() itself right. And then we can give appropriate guidance on how to use it.

8 Likes

With the current generator you’d see it within about 4 billion floats, so about 16 seconds on a single core. But with an ideal Float32 generator you’d never get it, even on your teraflop cluster, for millions of years.

Similarly, with an ideal Float64 generator you’d never get it, even more so.
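A rough check of those orders of magnitude, assuming “ideal” means P(rand(Float32) == 0) ≈ 2^-149, i.e. only a real uniform that rounds all the way down to zero gives you 0.0f0:

# Expected wall-clock time before an ideal Float32 generator first returns 0.0f0,
# at a (very generous) 1e12 draws per second:
expected_draws = 2.0^149                   # ≈ 7.1e44
seconds = expected_draws / 1e12
years   = seconds / (3600 * 24 * 365.25)   # ≈ 2e25 years
# The ideal Float64 case (≈ 2^-1074) is unimaginably further out still.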

We don’t need perfection, but based on the table Matt made I’d argue we need N = 128 at a minimum to avoid observing real deviations in the cdf for small-probability events on big clusters.

Of course we don’t need to generate 128 bits on each call, we just need that resolution in the left tail. rand32_for offered that, and only generated more than 32 bits once in 4 billion numbers.

1 Like
julia> foo() = while rand(Float32)!= 0 end 
foo (generic function with 1 method)
julia> @benchmark foo()
BenchmarkTools.Trial: 269 samples with 1 evaluation.
 Range (min … max):  44.633 μs … 108.039 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.281 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.660 ms ±  17.791 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▅▅▃▃▆▆▁▆             ▂                                       
  ███████████▇▇█▇▇▆▅█▃▇▆█▅▄▄▅▆▄▅▅▃▃▁▁▄▃▄▁▁▄▄▁▁▃▃▃▁▁▃▁▃▃▄▄▁▁▁▃▃ ▄
  44.6 μs         Histogram: frequency by time         74.3 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

Hunh, am I not understanding what rand(Float32) is doing or am I being too cavalier about an estimate of the time it should take?

I’m imagining

function rand(Float32) 
   i = rand(UInt32)
   return ldexp(Float32(i),-32)
end

That should return 0 on avg every 4 billion generations right?

Note, I guess the time should be geometrically distributed… so my estimate is very coarse; still, I’d expect hundreds of milliseconds or a second or something.


julia> function randdan()
              i = rand(UInt32)
              return ldexp(Float32(i),-32)
              end
randdan (generic function with 1 method)

julia> foo() = while randdan() != 0 end
foo (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark(foo())
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 28.241 s (0.00% GC) to evaluate,
 with a memory estimate of 0 bytes, over 0 allocations.

Ok, so looking into it in share/julia/stdlib/v1.9/Random/src/generation.jl we have:

rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01{Float32}}) =
    reinterpret(Float32, rand(r, UInt23()) | 0x3f800000) - 1

So I guess it’s worse than I thought.

That’s the fallback code for RNGs that only implement the minimal interface. The relevant code for the default Xoshiro RNG is here: https://github.com/JuliaLang/julia/blob/2b73a1d6c64258fdbfa3d44cc2604e841b666120/stdlib/Random/src/Xoshiro.jl#L299-L300. It looks like this:

rand(r::Union{TaskLocalRNG, Xoshiro}, ::SamplerTrivial{CloseOpen01{Float32}}) =
    Float32(rand(r, UInt32) >>> 8) * Float32(0x1.0p-24)

It throws away 8 bits of a UInt32 such that every value of the remaining 24 maps exactly to one Float32 without rounding.
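A quick way to convince yourself of that exactness (my REPL check, not from the thread):

# Float32 has a 24-bit significand, so every integer below 2^24 converts without rounding,
# and multiplying by 2^-24 only changes the exponent:
all(Float32(i) == i for i in 0:0x00ffffff)   # true
Float32(0x00000001) * Float32(0x1.0p-24)     # 5.9604645f-8, i.e. 2^-24, the smallest nonzero output
Float32(0x00ffffff) * Float32(0x1.0p-24)     # 0.99999994f0, i.e. 1 - 2^-24, the largest output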

1 Like

These kinds of arguments that 0.0 is uniquely “dangerous” make no sense to me. If you are doing an integral (or using a cdf) with a singularity in the domain, or at the edge of the domain, you need to be careful, no matter what method you are using.

If the student uses the trapezoidal rule to do their \int_0^1 with a singularity at 0, they will also get Inf — is that a bug in the trapezoidal rule? Even if they use something like QuadGK that in principle handles (integrable) endpoint singularities, they might get a failure if they request too low of an error tolerance, because singularities degrade the convergence rate (if they are not taken into account analytically).

Besides, you can have a singularity anywhere — if someone tries to integrate f(x) / \sqrt{|x - 0.25|} for x \in [0,1] by a Monte–Carlo method, they will get Inf if x = rand() ever returns x == 0.25. Should we therefore say that rand() should never return the “dangerous” value of 0.25?
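To make that concrete (my example; note that 0.25 is one of the multiples of 2^-24 the current rand(Float32) can return):

# Monte-Carlo estimate of ∫₀¹ dx / √|x − 0.25|, which is finite (= 1 + √3 ≈ 2.73),
# but a single sample landing exactly on 0.25 makes the running sum Inf.
f(x) = 1 / sqrt(abs(x - 0.25f0))
mc_estimate(n) = sum(f(rand(Float32)) for _ in 1:n) / n
# With 2^24 equally likely outputs, x == 0.25f0 shows up about once every 16.8 million draws.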

7 Likes

[Phew… I was beginning to think this thread would die out :stuck_out_tongue:]

This 0.0 issue is idiosyncratic and controversial enough to make a poll interesting:

  • rand(Float32) can output 0.0
  • rand(Float32) can NOT output 0.0
0 voters

I’m putting my vote in.

I think the argument comes down to how often people make the assumption that the left tail of [0,1) should have some detail to it. As an empirical question, I think this is probably a pretty common assumption. People put singularities near 0 specifically because they want to transform stuff near zero to stuff “near” infinity. It comes down to a question of frequency in real-world code. We’ve already seen Julia had this kind of thing in its sampling of exponentials (since bugfixed). People do if (rand() < epsilon) all the time, even if they should do if rand(Bernoulli(eps)) instead. Also, floating-point numbers behave differently near 0 than they do near 0.25 or 1:

eps(.25f0) ~ 2.98e-8 whereas eps(0.0f0) ~ 1.0f-45

That we can represent small numbers near 0 could lead people to think that they should expect such numbers to come out of rand(Float32); currently, though, the smallest nonzero rand(Float32) is ldexp(1.0, -24), or about 6e-8.
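The specific numbers, for reference:

eps(0.25f0)          # 2.9802322f-8, the spacing of Float32 values near 0.25
eps(0.0f0)           # 1.0f-45, the distance from 0 to the smallest positive (subnormal) Float32
ldexp(1.0f0, -24)    # 5.9604645f-8, the smallest nonzero value of the current rand(Float32)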

As far as the poll goes, I voted that it can output 0.0, but I still think it should do so with very small frequency, like ldexp(1.0f0,-64) or less.

But,

is a bug since rand(Bernoulli(epsilon)) uses <=, equality included (it is one character more, but something must be decided).

Ok, here’s another poll

Must rand() and rand!() produce the same float distribution?

  • Yes they must be identical
  • rand!() could use a different float distribution but still should have broadly similar dynamic range
  • rand!() could be different and have less dynamic range if it creates enough speed via SIMD
0 voters

Obviously I don’t know how to set up polls, please choose ONE option.

Suppose you do rand(Float32) < ldexp(Float32(pi),-18). I don’t think this threshold is one of the possible outputs of rand(), so we’re in the case where the cdf goes above the diagonal, and we’re going to get this event occurring at the wrong frequency. The frequency is going to be off by on the order of 5.96e-8, which is a relative error of about 0.005. Is this also a “bug”?
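Spelling out that arithmetic (the grid spacing is the worst case; the error for this particular threshold is smaller, but this is the order of magnitude in question):

thresh = ldexp(Float32(pi), -18)   # ≈ 1.1984e-5, not a multiple of 2^-24
grid   = ldexp(1.0f0, -24)         # spacing of the current rand(Float32) outputs
grid / thresh                      # ≈ 0.005, the worst-case relative error of P(rand(Float32) < thresh)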

As noted in a popular post earlier, we don’t expect crypto-level accuracy, just generally good behavior. But the 0.0 is different from a pole at 0.25, because rand() has a natural probability interpretation, and a run-time calculated probability of 0.0 or 1.0 will usually correspond to a high level of certainty (it could appear in a p = clamp(calc_p, 0.0, 1.0) expression). If rand(Bernoulli(p)) is subsequently calculated, one might reasonably expect it to always return false when p is 0.0.

Good point! rand(Bernoulli(0.0f0)) should ALWAYS return false! I think the <= test should not be used in Bernoulli.
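A minimal sketch of why the endpoints differ between the two comparisons on a [0, 1) generator (hypothetical helper names, not Distributions.jl internals):

# With `<`, both degenerate probabilities behave as expected for rand() in [0, 1):
bern_lt(p) = rand() < p    # p = 0.0 can never fire; p = 1.0 always fires, since rand() < 1.0
# With `<=`, p = 1.0 still always fires, but p = 0.0 now fires whenever rand() happens to be 0.0:
bern_le(p) = rand() <= p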

See Output distribution of rand(Float32) and rand(Float64), thread 2 - #143 by StefanKarpinski and Output distribution of rand(Float32) and rand(Float64), thread 2 - #147 by mbauman for more on when it’s appropriate to use < vs <=.

3 Likes

From:

if you’re sampling from (0, 1) then I’m not sure what you should do.

So not much help regarding the rand(Bernoulli(p)) problem. Although rand(Bernoulli()) uses Float64 IIRC, so 0.0 is a more remote possibility.

I suspect the only thing that will help this thread come to a productive end is to consolidate the proposed algorithms and detail for each

  1. the bits of randomness
  2. the support (e.g. does it include 0, does it include every representable float)
  3. for both domains, the representable floats and the real interval [0,1] (see the sketch after this list):
    • the maximum difference between ideal cdf and empirical cdf
    • mean abs difference between ideal and empirical
    • uniformity of diff between ideal and empirical
    • “staircase graph” of empirical cdf
  4. performance concerns (i.e. is it branchy, loopy, simdable)

otherwise there will be another 100 posts on whether or not log(rand()) is a user mistake.
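As an example of item 3, the maximum difference between the ideal cdf and the achieved cdf can be written down in closed form for the current rand(Float32) (my sketch; the outputs are exactly the multiples of 2^-24):

# P(rand(Float32) < x) for the current generator is ceil(x * 2^24) / 2^24, so the signed
# excess over the ideal cdf (which is just x) is:
cdf_excess(x) = ceil(x * 2.0^24) / 2.0^24 - x
cdf_excess(0.5)                             # 0.0, grid points are hit exactly
cdf_excess(nextfloat(ldexp(1.0f0, -24)))    # ≈ 5.96e-8, nearly a full grid step in the left tail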

More and more I’m beginning to like this convolution style of random number generation:

function float32conv()
    r = 2.0f0
    # rejection loop: keep drawing until the convolved value lands in [0, 1)
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = i >>> 32           # high 32 bits: the coarse uniform
        b = i & (1<<32 - 1)    # low 32 bits: a second, much narrower uniform
        # wide uniform plus narrow uniform (the convolution), recentered by 2^-16
        r = ldexp(Float32(a),-32) + ldexp(Float32(b),-32-15) - ldexp(1.0f0,-16)
    end
    r
end

This convolution distribution smooths out the “stairs” in the CDF throughout the range. I’m investigating why it seems to have a slight bias below the line in this region. I’m guessing it’s because of the nature of floating-point addition and roundoff… not sure.

Here is the ECDF for 5000 samples less than 1e-6, for rand(Float32) (staircase) and the convolution distribution (smooth blue) as well as the ideal line (green)

I think people are going to act like rand() produces something with a smooth cdf like this, not a staircase, because this is the mathematical model they have in their head, and they know that Float32 can represent it to the accuracy shown here.
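For anyone who wants to reproduce this kind of ECDF comparison, a rough sketch (my code; the plotting package and styling are arbitrary choices, not what was used for the figures in this thread, and collecting 5000 sub-1e-6 samples takes a few billion underlying draws):

using Plots

# Collect n samples below `cutoff` from a generator and return them sorted.
function samples_below(gen, cutoff, n)
    xs = Float64[]
    while length(xs) < n
        x = gen()
        x < cutoff && push!(xs, x)
    end
    sort!(xs)
end

cutoff, n = 1e-6, 5000
xs_rand = samples_below(() -> rand(Float32), cutoff, n)
xs_conv = samples_below(float32conv, cutoff, n)

# Conditional ECDFs below the cutoff, against the ideal straight line.
plot(xs_rand, (1:n) ./ n; seriestype = :steppost, label = "rand(Float32)")
plot!(xs_conv, (1:n) ./ n; label = "float32conv")
plot!([0, cutoff], [0, 1]; label = "ideal")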

EDIT: yes, by putting in some parentheses I can eliminate that bias:


function float32conv()
    r = 2.0f0
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = i >>> 32
        b = i & (1<<32 - 1)
        r = ldexp(Float32(a),-32) + (ldexp(Float32(b),-32-15) - ldexp(1.0f0,-16))
    end
    r
end

Investigating further, the bias seems to persist as I move farther up the CDF… I’ll look into it more; I think it has to do with the asymmetry of resolution in Float32 near 0 vs near 1… should be fixable.

Just needed to draw the diagram and be careful about where the uniform region is… this seems to fix the bias:

function float32conv()
    r = 2.0f0
    # rejection loop: keep drawing until the convolved value lands in [0, 1)
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = (i >>> 32) % UInt32               # high 32 bits: the coarse uniform
        b = reinterpret(Int32, i % UInt32)    # low 32 bits, taken as a signed narrow uniform
        # coarse plus narrow uniform, shifted and rescaled so the flat part of the
        # convolution covers [0, 1)
        r = (ldexp(Float32(a),-32) + (ldexp(Float32(b),-31-15) - ldexp(1.0f0,-15)))/(1.0f0 - ldexp(1.0f0,-14))
    end
    r
end

It takes about 2x the time of our rand(), but about 1.62 times the time of ldexp(Float32(rand(UInt32)),-32).

The smallest nonzero value this can produce is on the order of ldexp(1.0f0,-32-15) ~ 7e-15, and the cdf is obviously DRAMATICALLY better in the region below 1e-7.

Here’s the region below 1e-6

Here’s the region below 1e-3

and the whole [0,1) region:

Can you do a log-log plot?