I’m only mildly worried about that case – you’ll hit Inf within milliseconds’ worth of rand(Float32) invocations.
I am much more worried about:
The grad student who uses your randno0(Float32), applies a singular inverse cdf or something like it, gets silently, subtly wrong tail statistics, and unwittingly pollutes the scientific literature (edit: or loses a year of their PhD chasing wild geese; your pick which is worse).
The grad student who uses rand(Float64), dutifully tests it on their laptop over the weekend, dutifully applies for and obtains a rare spot on their local cluster, and not-so-promptly blows up with an Inf. 2^52 might look like a large number, but that’s a mere teraflop-hour. I feel old now; what a preposterous combination of words, “mere teraflop-hour”. (Of course most time will be spent doing useful stuff, not generating random numbers; if you assume that RNG takes 0.1% of the compute time, then you blow up in 1k teraflop-hours, which is a less preposterous amount of compute than we old-timers like to believe…)
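(Checking the arithmetic: 2^52 ≈ 4.5e15 draws, while a teraflop-hour is 1e12 × 3600 = 3.6e15 operations, so the orders of magnitude do line up.)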
We could also simplify this by choosing to let the “bits gradually run out” — instead of using loops, just hard code maybe three or four highly unlikely nested branches of the thing. It’ll be easier to prove correct that way — and we don’t need to worry about counting the bits when we get close to rounding to zero (which will never happen anyway!).
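Something like the following is a minimal sketch of that unrolling (the name and depth are mine, purely illustrative):

```julia
# Each all-zero draw descends one level, with probability 2^-32 per level;
# after a fixed depth we accept rounding to zero rather than looping forever.
function rand_unrolled(::Type{Float32})
    a = rand(UInt32)
    a != 0 && return ldexp(Float32(a), -32)   # the common case
    b = rand(UInt32)                          # reached with probability 2^-32
    b != 0 && return ldexp(Float32(b), -64)
    c = rand(UInt32)                          # reached with probability 2^-64
    c != 0 && return ldexp(Float32(c), -96)
    return 0.0f0                              # probability 2^-96: just call it 0
end
```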
Here’s a table of interesting values of N (using my usual assumption of worst-case bit usage with N bits), showing at what point values of p are densely sampled — or, in other words, the regions of [0, 1) which have a perfectly accurate \text{cdf} for all representable values.
| N    | P(= 0)    | Float16               | Float32            | Float64                     |
|------|-----------|-----------------------|--------------------|-----------------------------|
| 11   | 0.00049   | p ≥ 0.5               | -                  | -                           |
| 24   | 5.96e-08  | p ≥ Float16(6.104e-5) | p ≥ 0.5            | -                           |
| 32   | 2.33e-10  | p ≥ 0                 | p ≥ 0.001953125f0  | -                           |
| 53   | 1.11e-16  | p ≥ 0                 | p ≥ 9.313226f-10   | p ≥ 0.5                     |
| 64   | 5.42e-20  | p ≥ 0                 | p ≥ 4.5474735f-13  | p ≥ 0.000244140625          |
| 126  | 1.18e-38  | p ≥ 0                 | p ≥ 9.8607613f-32  | p ≥ 5.293955920339377e-23   |
| 128  | 2.94e-39  | p ≥ 0                 | p ≥ 2.4651903f-32  | p ≥ 1.3234889800848443e-23  |
| 149  | 1.4e-45   | p ≥ 0                 | p ≥ 0              | p ≥ 6.310887241768095e-30   |
| 196  | 9.96e-60  | p ≥ 0                 | p ≥ 0              | p ≥ 4.484155085839415e-44   |
| 256  | 8.64e-78  | p ≥ 0                 | p ≥ 0              | p ≥ 8.636168555094445e-78   |
| 1022 | 2.23e-308 | p ≥ 0                 | p ≥ 0              | p ≥ 1.0020841800044864e-292 |
| 1074 | 5.0e-324  | p ≥ 0                 | p ≥ 0              | p ≥ 0                       |
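(For reference, the P(= 0) column is just 2^-N, the chance that all N worst-case bits come up zero; in Julia, ldexp(1.0, -N) reproduces it exactly, e.g. ldexp(1.0, -24) ≈ 5.96e-8 and ldexp(1.0, -1074) == 5.0e-324.)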
We can’t know how folks will (ab)use rand() but we can get rand() itself right. And then we can give appropriate guidance on how to use it.
With the current generator, you’d get it within about 4 billion floats, so within about 16 seconds on a single core. But with an ideal Float32 generator you’d never get it, even on your teraflop cluster running for millions of years.
Similarly, with an ideal Float64 generator you’d never get it, even more emphatically.
We don’t need perfection, but based on the table Matt made, I’d argue we need N = 128 at a minimum to avoid observing real deviations in the cdf for small-probability events on big clusters.
Of course we don’t need to generate 128 bits on each call; we just need that resolution in the left tail. rand32_for offered that, and only generated more than 32 bits once in 4 billion numbers.
Hunh, am I not understanding what rand(Float32) is doing or am I being too cavalier about an estimate of the time it should take?
I’m imagining
```julia
# a sketch of what rand(Float32) might do, not a working definition
function rand(::Type{Float32})
    i = rand(UInt32)
    return ldexp(Float32(i), -32)
end
```
That should return 0 on average every 4 billion generations, right?
Note, I guess the time should be geometrically distributed… so my estimate is very coarse. Still, I’d expect hundreds of milliseconds, or a second, or something.
```julia
julia> function randdan()
           i = rand(UInt32)
           return ldexp(Float32(i), -32)
       end
randdan (generic function with 1 method)

julia> foo() = while randdan() != 0 end
foo (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark(foo())
BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 28.241 s (0.00% GC) to evaluate,
 with a memory estimate of 0 bytes, over 0 allocations.
```
Ok, so looking into it in share/julia/stdlib/v1.9/Random/src/generation.jl we have:
These kinds of arguments that 0.0 is uniquely “dangerous” make no sense to me. If you are doing an integral (or using a cdf) with a singularity in the domain, or at the edge of the domain, you need to be careful, no matter what method you are using.
If the student uses the trapezoidal rule to do their \int_0^1 with a singularity at 0, they will also get Inf. Is that a bug in the trapezoidal rule? Even if they use something like QuadGK that in principle handles (integrable) endpoint singularities, they might get a failure if they request too low an error tolerance, because singularities degrade the convergence rate (if they are not taken into account analytically).
Besides, you can have a singularity anywhere: if someone tries to integrate f(x) / \sqrt{|x - 0.25|} for x \in [0,1] by a Monte Carlo method, they will get Inf if x = rand() ever returns x == 0.25. Should we therefore say that rand() should never return the “dangerous” value of 0.25?
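As a toy illustration of that point (my own example, not from the post):

```julia
# Monte Carlo estimate of the (finite) integral of 1/sqrt(|x - 0.25|) on [0,1].
# A single draw landing exactly on 0.25 -- a representable Float64 and a
# possible output of rand() -- makes one term, and hence the estimate, Inf.
mc_estimate(n) = sum(1 / sqrt(abs(rand() - 0.25)) for _ in 1:n) / n
```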
I think the argument comes down to how often people assume that the left tail of [0,1) has some detail to it. As an empirical question, this is probably a pretty common assumption. People put singularities near 0 specifically because they want to transform stuff near zero to stuff “near” infinity. It comes down to a question of frequency in real-world code. We’ve already seen Julia had this kind of thing in sampling exponentials (since fixed). People write if rand() < epsilon all the time, even if they should write if rand(Bernoulli(eps)). Also, floating-point numbers behave differently near 0 than they do near 0.25 or 1.
eps(0.25f0) ≈ 2.98e-8, whereas eps(0.0f0) ≈ 1.0f-45.
That we can represent small numbers near 0 could lead people to think they should expect such numbers to come out of rand(Float32), though currently the smallest nonzero rand(Float32) is ldexp(1.0, -24), or about 6e-8.
As for the poll, I voted that it can output 0.0, but I still think it should do so with very small frequency, like ldexp(1.0f0, -64) or less.
Suppose you do rand(Float32) < ldexp(Float32(pi), -18). I don’t think this threshold is one of the possible outputs of rand(), so we’re in the case where the cdf goes above the diagonal, and this comparison is going to fire at the wrong frequency. The frequency is going to be off by on the order of 5.96e-8, which is a relative error of about 0.005. Is this also a “bug”?
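A back-of-envelope check of those numbers, assuming the current generator’s outputs are exactly the multiples of 2^-24:

```julia
t = ldexp(Float32(pi), -18)                # ≈ 1.1984f-5, not a multiple of 2^-24
p_actual = ceil(Float64(t) * 2^24) / 2^24  # fraction of the 2^24 outputs below t
abs_err = p_actual - Float64(t)            # ≈ 5.5e-8, on the order of eps = 5.96e-8
rel_err = abs_err / Float64(t)             # ≈ 0.005
```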
As noted in a popular post earlier, we don’t expect crypto-level accuracy, just generally good behavior. But 0.0 is different from a pole at 0.25, because rand() has a natural probability interpretation, and a run-time calculated probability of 0.0 or 1.0 will usually correspond to a high level of certainty (it could appear in a p = clamp(calc_p, 0.0, 1.0) expression). If rand(Bernoulli(p)) is subsequently calculated, one might reasonably expect it to always return false when p is exactly 0.0.
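For instance (using Distributions.jl’s Bernoulli; calc_p here is a made-up stand-in for some computed quantity):

```julia
using Distributions  # for Bernoulli

calc_p = -1e-9                # stand-in for a computed probability with roundoff
p = clamp(calc_p, 0.0, 1.0)   # p == 0.0 exactly
rand(Bernoulli(p))            # the expectation: this is always false
```

Whether that expectation actually holds depends both on the sampler’s comparison (rand() < p versus rand() <= p) and on whether rand() can return exactly 0.0.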
More and more I’m beginning to like this convolution style of random number generation:
```julia
function float32conv()
    r = 2.0f0
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = i >>> 32          # high 32 bits: coarse uniform on [0, 1)
        b = i & (1<<32 - 1)   # low 32 bits (mask is 2^32 - 1): fine uniform
        # convolution: coarse plus fine uniform, shifted left by 2^-16
        r = ldexp(Float32(a), -32) + ldexp(Float32(b), -32-15) - ldexp(1.0f0, -16)
    end
    r
end
```
This convolution distribution smooths out the “stairs” in the CDF throughout the range. I’m investigating why it seems to have a slight bias below the line in this region. I’m guessing it’s because of the nature of floating-point addition and roundoff… not sure.
Here is the ECDF of 5000 samples less than 1e-6, for rand(Float32) (staircase) and the convolution distribution (smooth blue), as well as the ideal line (green):
I think people are going to act like rand() produces something with a smooth cdf like this, not a staircase, because this is the mathematical model they have in their head, and they know that Float32 can represent it to the accuracy shown here.
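For reference, a rough reconstruction of how such a tail ECDF can be collected (my own sketch; at a cutoff of 1e-6 it needs on the order of 5e9 draws per generator, so it’s slow):

```julia
# Collect n sorted samples below `cutoff` from generator `gen`; plotting them
# against (1:n)./n gives the conditional ECDF in the tail.
function tail_samples(gen, cutoff, n)
    xs = Float32[]
    while length(xs) < n
        x = gen()
        x < cutoff && push!(xs, x)
    end
    sort!(xs)
end

staircase = tail_samples(() -> rand(Float32), 1.0f-6, 5000)
smooth    = tail_samples(float32conv, 1.0f-6, 5000)
```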
EDIT: yes, by putting in some parentheses I can eliminate that bias:
```julia
function float32conv()
    r = 2.0f0
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = i >>> 32
        b = i & (1<<32 - 1)
        # parentheses make the two small terms combine before the big one,
        # avoiding the roundoff that produced the bias
        r = ldexp(Float32(a), -32) + (ldexp(Float32(b), -32-15) - ldexp(1.0f0, -16))
    end
    r
end
```
Investigating further, the bias seems to persist farther up the CDF… I’ll look into that more. I think it has to do with the asymmetry of resolution in Float32 near 0 vs near 1… should be fixable.
Just needed to draw the diagram and be careful about where the uniform region is… this seems to fix the bias:
```julia
function float32conv()
    r = 2.0f0
    while r < 0.0f0 || r >= 1.0
        i = rand(UInt64)
        a = (i >>> 32) % UInt32             # high 32 bits
        b = reinterpret(Int32, i % UInt32)  # low 32 bits as a signed offset
        # signed fine-scale term plus a shift and rescale, arranged so the
        # flat (uniform) part of the convolution covers all of [0, 1)
        r = (ldexp(Float32(a), -32) +
             (ldexp(Float32(b), -31-15) - ldexp(1.0f0, -15))) / (1.0f0 - ldexp(1.0f0, -14))
    end
    r
end
```
It takes about 2x the time of our rand(), but about 1.62x the time of ldexp(Float32(rand(UInt32)), -32).
The smallest nonzero value this can produce is on the order of ldexp(1.0f0, -32-15) ≈ 7e-15, and the cdf is obviously DRAMATICALLY better in the region below 1e-7.