I do think that if the value would be larger than floatmax then it should probably be an Inf, so the implementation seems correct to me. It could maybe use an option to resample in that case, but it’s easy to do on the user end.
But the probability for that is completely wrong. Your algorithm returns Inf with a probability of ldexp(1.0, -64) == 5.421010862427522e-20, i.e. likely to happen in human history, while the correct probability is closer to ldexp(1.0, -1022) = 2.2250738585072014e-308, i.e. not gonna happen.
No, you’re just interpreting it wrong. The probabilities aren’t about the exact value but rather the likelihood it’s between that and the next value. How big that interval is depends on that N I keep talking about. Transformations like 1/x and log(x) change from a left-closed (test with <) to a right-closed (test with <=) distribution.
julia> floatmax(1.0f0)
3.4028235f38
The log-probability of being greater than floatmax for an exponential distribution is:
julia> logccdf(Exponential(),Float64(floatmax(1.0f0)))
-3.4028234663852886e38
exp(-3e38) is really REALLY small. It should NEVER happen, no matter how large a cluster you run on.
I was pointing out that randexp has a very small but non-negligible probability of returning Inf; somewhere around ldexp(1.0, -64).
@StefanKarpinski pointed out that randexp generating Inf is not necessarily a bug – the geometric distribution can generate a value > floatmax, which would correctly round to Inf.
To that I answered that randexp returning something larger than floatmax should be much, much more unlikely. In fact, I computed it wrong! The probability of generating something > floatmax is on the order of exp(-floatmax), i.e. 2^(-2^1000). This is a probability that is much smaller than 2^(-64).
It is comparable to 2^(-1000) only insofar as both 2^(-1000) and 2^(-2^1000) are equal to never in our finite universe.
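To make that comparison concrete, here’s a back-of-the-envelope sketch (variable names are just for illustration): since exp(-x) == 2^(-x / log(2)), we can compare the two probabilities on a log2 scale, where the raw values would underflow to zero.

```julia
# Compare the two probabilities on a log2 scale, since the raw values
# underflow. exp(-x) == 2^(-x / log(2)).
log2_p_overflow = -Float64(floatmax(Float32)) / log(2.0)  # ≈ -4.9e38
log2_p_bug      = -64.0                                   # the reported Inf rate, 2^-64
# log2_p_overflow is astronomically more negative than log2_p_bug
```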
Ok, from R here’s a link with no preview
They call exp_rand and then scale it… so exp_rand is:
This is a bizarre and tremendously slow implementation, I think? They take a uniform random number and then add it to itself in a loop until it exceeds 1… if u is 1e-20 that could be 1e20 cycles? (EDIT: Oh no, of course, they’re doubling it every time, so it would take around 67 cycles, since log2(1e20) ≈ 66.) Obviously low probability, but the worst case is really bad? There’s some other stuff going on; I don’t understand the point of it all, but I suspect they value correctness vastly more than speed here.
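To sanity-check the cycle count (this is just a toy modeled on the doubling described above, not a translation of the R code):

```julia
# Count how many self-additions (doublings) it takes for u to exceed 1.
# Even a tiny u like 1e-20 needs only ~67 doublings, not 1e20 iterations.
function doublings_to_exceed_one(u)
    n = 0
    while u <= 1.0
        u += u          # double u each iteration
        n += 1
    end
    return n
end
```

For example, `doublings_to_exceed_one(1e-20)` returns 67, matching log2(1e20) ≈ 66.4 rounded up. Note that this toy never terminates for u == 0, which is presumably why the R code excludes 0 from its uniform draw.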
Please be aware that those links appear to link to GPL code - a direct translation to Julia may not comply with the MIT license the language currently uses.
Note that they also generate a unif_rand and exclude 0 and 1 from their algorithm. If they got a 0, their algorithm wouldn’t terminate.
I’m not suggesting a direct translation just to examine what the distribution looks like and if it excludes some of these problematic possibilities (such as Inf). It clearly does.
Have we reached consensus that randexp outputting Inf is, in fact, a bug?
The actual point I wanted to make here is not that we should fix randexp.
The point I wanted to make is that a rand distribution that outputs zero is like handing a loaded gun to a group of toddlers. It may not blow off somebody’s face today, but it is a bad idea.
I was doubted on that being a bad idea, a la “REAL programmers don’t write buggy code with rand() that outputs zero”.
So I tried to bring some supporting evidence on the badness of the idea. The only empirical evidence I could come up with, besides repeating the same argument again and again, is pictures of people shooting off their faces. The faceless corpse I brought was randexp, and my argument was “not even the Random module itself is good at zero-safety; there is no hope for other user code”.
Am I supposed to bring other supporting evidence for my claim that including zero in the output is a bad idea?
What kind of evidence would be convincing to you?
More examples? Examples in “serious REAL PROGRAMMER” code? Does it get more serious than stdlib?
It’s conceivable that their algorithm is slower than just calling -log(), but it does guarantee you get a valid float.
julia> function rexp()
u=rand()
while u <= 0.0 || u >= 1.0
u = rand()
end
return -log(u)
end
rexp (generic function with 1 method)
julia> using BenchmarkTools
julia> @btime rexp()
17.247 ns (0 allocations: 0 bytes)
0.26927999728116886
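To see concretely what the `u <= 0.0 || u >= 1.0` guard is protecting against, look at what the naive transform does to the two endpoints:

```julia
# Endpoint behavior of the naive transform -log(u): the rejection loop in
# rexp() above exists precisely to keep these two inputs out.
-log(0.0)   # a zero from rand() becomes an infinite "sample"
-log(1.0)   # the other endpoint collapses to (negative) zero
```

The first is the randexp-style Inf hazard discussed above; the second quietly reintroduces zero into the output of the exponential sampler.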
That’s for double floats… for 32 bit:
julia> function rexp32()
u=rand(Float32)
while u <= 0.0f0 || u >= 1.0f0
u = rand(Float32)
end
return -log(u)
end
rexp32 (generic function with 1 method)
julia> @btime rexp32()
16.092 ns (0 allocations: 0 bytes)
Yes, 100%, and I’m 100% with you that we should sample uniforms on (0,1).
I doubt very much that anyone has any code that relies on rand(Float32) returning 0.0f0, much less rand(Float64) returning 0.0; these are so unlikely that it’d be stupid to have your code break if it doesn’t happen. However, it’s not stupid to think that people’s code would break if 0.0 does happen, as we’ve already seen.
Wow, that’s hard to parse. What I’m trying to say is that excluding 0.0 is the right thing to do, and it’s only likely to FIX bugs, not cause them.
Sure, but please also be aware that the link preview contains that same GPL’d source code, and some folks may not appreciate looking at it inadvertently while scrolling through a random (pun not intended) thread on this Discourse. You could place those links behind a spoiler tag, for example, to remove that possibility.
If you like I could do that. It’d be nice to just be able to post a link without Discourse creating a preview. I certainly didn’t tell it to create that preview. Also, none of the actual visible code is relevant to the algorithm, the first link is just a shell function that calls the real work, the second link is just preview of comments.
done, I just created text links.
For those advocating sampling from (0, 1), please provide a proposed support set along with a justifiable uniform distribution criterion on that support set. Or put another way, how, in theory, are you rounding all the numbers in the mathematical set (0, 1) to the proposed support? So far the arguments have been “exclude it” with no further detail. Keep in mind that you want either P(rand() < p) = p or P(rand() ≤ p) = p for the very common use case of doing Bernoulli sampling and you want to be able to tell users to do one of those two things and get properly distributed Bernoulli variables.
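As a concrete instance of that Bernoulli criterion (a sketch, assuming the current 53-bit-grid Float64 generator mentioned later in the thread): with outputs drawn from {0, 1, …, 2^53 − 1} / 2^53, the left-closed test rand() < p is exact for any p that lies on the grid.

```julia
# Sketch of the Bernoulli criterion for a [0, 1) grid generator: the 2^53
# equally likely outputs are k / 2^53, and exactly p * 2^53 of them satisfy
# k / 2^53 < p whenever p itself is a grid point.
grid        = 2.0^53
p           = 0.25
count_below = p * grid           # outputs strictly below p
prob_below  = count_below / grid # fraction of outputs with rand() < p
# prob_below == p exactly, so P(rand() < p) = p holds for grid-aligned p
```

Any proposal that rejects or remaps outcomes near 0 has to say how much this identity is perturbed, which is exactly the question being asked above.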
What’s wrong with this (where _rand() does the real work of sampling from something like [0,1] or [0,1), etc.)?
function rand()
    u = _rand()
    while u <= 0 || u >= 1
        u = _rand()
    end
    return u
end
We can think of this as sampling from a grid rand(UInt64)/2^64 and just not accepting anything small enough to round down to 0. Yes, this is a bias, but it’s an acceptable one, and better than being able to generate singular results (such as Inf), in my opinion. The alternative, rounding to 0, also induces a bias for rand(Exponential()), as we’ve already seen, and a worse one.
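To put a number on that bias (illustrative only): rejecting the single zero outcome on a 2^64 grid removes one outcome out of 2^64, so no probability shifts by more than one grid cell.

```julia
# Size of the bias from rejecting 0 on a rand(UInt64)/2^64 grid: one outcome
# out of 2^64 is removed, so no event's probability shifts by more than 2^-64.
bias = ldexp(1.0, -64)
# bias == 5.421010862427522e-20, the same "likely in human history" number
# quoted earlier in the thread for the randexp Inf rate
```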
That’s not how consensus works
I’m absolutely, most certainly against this idea, and further arguments about 0s are not going to convince me otherwise. I totally get the pragmatism but it fundamentally breaks the probabilities and meaning of rand.
What could convince me? If you could come up with a coherent explanation of what the values and their likelihoods mean. That’s what I’m missing. Or maybe if we were able to increase N enough that such a problem is unobservable. But that might not be practical.
it breaks the probabilities and meaning of rand.
No it doesn’t. If you sample from rand(), you will never get a value between 1e-9201 and 2e-9201, yet probability theory says such values must occur infinitely many times in an infinite sequence of samples. If you sample from mathematical interval [0,1] uniformly you will get NO samples exactly equal to 0.0 in an infinite sequence of samples.
Computer floats are only an approximation to real numbers. That they break down when very fine resolution matters is well known. Excluding 0.0 doesn’t damage the approximation to the reals in any meaningful way (since sampling exactly 0.0 is impossible mathematically, and it’s already impossible with floats to get arbitrarily small values). In some sense, rounding to 0 infinitely increases the probability of getting something that is mathematically impossible.
If you sample from mathematical interval [0,1] uniformly you will get NO samples exactly equal to 0.0 in an infinite sequence of samples.
In some sense, if you sample from an actual dense interval, you will get NO samples exactly equal to any particular value.
a value between 1e-9201 and 2e-9201
Our current rand() would give you a zero, because it falls in the bin [0, 2^-53). That’s what rand currently means. See Output distribution of rand(Float32) and rand(Float64), thread 2 - #147 by mbauman.
Right, and my version would never give you anything there; the smallest thing it’d give you for Float32 is 1.0f-45. Which is to say that we damaged the distribution about as much as using Float32 damages the distribution: the bias/error is on the order of the minimum possible value it could be.
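For reference, that floor is just the smallest positive subnormal in each format:

```julia
# Smallest positive representable value in each float format (subnormals):
tiny32 = nextfloat(0.0f0)   # Float32: 1.0f-45 (i.e. 2^-149)
tiny64 = nextfloat(0.0)     # Float64: 5.0e-324 (i.e. 2^-1074)
```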