Output distribution of rand(Float32) and rand(Float64), thread 2

Personally, I’ve always been scared (perhaps unreasonably so) to muck around with rand()'s output. I’m not sure where this trepidation came from, but it predates Julia — I remember fretting about hand-written re-computations of random distributions in Matlab and C in the 00s. I don’t think it really matter whether the output distribution is evenly spaced or if it’s logarithmic; either way, floating point challenges aren’t too far away.

In my view, getting random distributions right is the work of magicians — along the same lines as getting encryption right is (if not even more fundamental). To wit, the straw-stuffed implementation in the first thread on this isn’t generating a correct uniform distribution AFAICS — so we don’t have a solid view on what such a performance impact would be, but it’ll certainly be worse than what’s there.

4 Likes

A common thread in many discussions is floating point is strange, difficult, yada, yada. The reason floating point is so dominant is varied, but a big part of it is QWERTY like intertia from a time before Julia’s type expressibility.

If you want a uniform [0,1) distribution, which is a common need, then:

using FixedPointNumbers
using Random

rightrand(rng = Random.default_rng()) = Fixed{Int64, 52}(Random.rand(rng, Random.UInt52Raw(UInt64)),0)

should do the trick, and is probably drop-in compatible almost everywhere because of conversion to Float under some operations.

For example:

julia> sum(rightrand() < 0.3 for i in 1:1000)
319

and

julia> rightrand() * 13.2
10.224973018628942

julia> rightrand() * 13.2 |> typeof
Float64

Fixing rand() to act this way would be almost invisible and an improvement IMHO.

Isn’t that effectively the same as the status quo’s evenly-spaced floating point output? What would the improvements be?

It is simpler to reason about, and distances programs from Float which eventually makes indeterminism and rounding errors creep in.

As a way forward, it seems better to revert to fixed-point and regain stuff like commutativity and associativity.

But this is might be a discussion for a separate thread.

No, that’s not the case now and won’t be the case in an improved implementation:

julia> function test_rand(b::Bernoulli, N)
           hits = 0
           misses = 0
           for i in 1:N
              sample(b) ? (hits += 1) : (misses += 1)
           end
           @show N, hits, hits / N, b.p
           @show N, misses, misses / N, 1 - b.p
           return nothing
       end
test_rand (generic function with 1 method)

julia> test_rand(Bernoulli(prevfloat(1f0)), typemax(Int32))
(N, hits, hits / N, b.p) = (2147483647, 2147483529, 0.9999999450519681, 0.99999994f0)
(N, misses, misses / N, 1 - b.p) = (2147483647, 118, 5.4948031927900404e-8, 5.9604645f-8)

The current issue on the low end is that the sample space of rand is much too sparse to reflect the precision with which one can specify a small frequency p::Float32. On the upper end, the available precision for p is lower but perfectly matched by the sample space, so the Bernoulli random variable is weighted according to specification.

1 Like

Do any bit-twiddlers here know how to accomplish the above in a performant, SIMD-able way? Alternatively r / 2^{32} rounded down, which would be the way to generate samples in [0, 1).

2^{-32} isn’t the smallest nonzero Float32, but it’s a lot closer to zero than the current smallest nonzero rand()-output, which is 2^{-23}. To me, a performant implementation of this conversion seems like a good compromise, basically making the most of the single UInt32 that a lot of RNGs produce as their raw output.

With the FixedPoint approach I’m advocating, a 30 bit uniform [0,1) is easily and efficiently sampled:

using FixedPointNumbers
using Random

rightrand(rng = Random.default_rng()) = Fixed{Int32, 30}(Random.rand(rng, UInt32)>>2,0)

Another benefit with this no-rounding approach is that you can actually think clearly about:
Prob(rand() + rand() == 1), something which in the floating point world with rounding you suggested, the answer might not be immediate.

julia> rightrand()+rightrand()
1.2446260424Q1f30

julia> sum(i->rightrand()+rightrand()==1, 1:2^31)
1

Can a 32-bit binary fixed-point number be used as an intermediate representation in a performant, SIMD-able implementation of rand(Float32)? That would be great, as it would presumably lead to code that’s readable and easy to understand. Basically, the conversion I’m talking about would then live in an efficient implementation of convert.

This does the rounding:

julia> rightrand()*1.0f0   # or rightrand()+0.0f0
0.09457736f0

But the fixed-point is basically a thin wrapper around doing calculation on Ints. The motivation is that many calculations (or the random number usage parts) can be kept in this fixed realm.

For example, if it is just used to flip a coin by comparing to some number (in any representation).

No doubt, but far from all, and even in a hypothetical scenario where rand() were changed to produce a fixed-point number, I don’t think rand(Float32/64) would be deprecated, so getting their implementation both right and performant is a relevant problem, and that’s what I think this discussion is really about (even if it says rand() in the title).

1 Like

That would be adequately approximated by false, given the amount of universe between now and heat-death. Bernoulli(1.0f-8) on the other hand expects ~ 10 events per core-second, which is not negligible.

People who do 1 - v and get surprised by the results cannot be helped by the standard library, they need code-review, or a good linter, and maybe a basic course in computer science / numerical computing. cf here

==============================

Thanks for spotting that point of confusion, I edited the title. The thread was tagged float from the very beginning.

==============================

A lot of people here mentioned Shannon Entropy of the output distribution, considered as bitstrings (i.e. no continuous stuff).

The extra dynamic range adds ca 1 bit of entropy. The annoying part for implementation is that the information content of output values is no longer uniform, and can exceed 32 / 64 bit. Hence we can either cheat in some way, or we’re stuck with a very unlikely branch that needs to call the random source again.

==============================

Over on hackernews there is a nice link to the python docs which show a good awareness of the issue.

==============================

I don’t want to discuss implementations too early, basically because there appears to be no consensus on whether the output distribution of rand(Float32) is “working as intended” or “regrettable oversight” or “regrettable tradeoff for performance / legacy”.

That being said, a lot of people discussing implementations are under the misapprehension that rand(Float32) must work with a random UInt32, for performance reasons. This was true in the olden days when I started the previous thread 5 years ago.

Today, the default RNG doesn’t bother – it will generate a random UInt64, then throw away half of it to get a random UInt32, and then make a Float32. In other words, we have lots of free random bits to play with.

After some more thinking, I believe an ugly compromise way forward could be:

  1. rand(Float64) excludes 0.0 from the output distribution, but is otherwise unchanged. Reason: exact zeros are liable to trigger bugs. The previous distribution generated exact zeros about once per core-year, so this was a super rare event anyways.
  2. rand(Float32) = convert(Float32, rand(Float64)) on CPU and SIMD. We used 64 bit of entropy for rand(Float64) anyways. This kind of float truncation / rounding has hardware support and is fast.
1 Like

Don’t think that excluding 0.0 is the way forward. Because, if you consider a “Float16” (or other small floats which are becoming popular) you need a more principled approach.

A mathematical approach would look at a quality measure for a float RNG as follows:

Error(RNG) = min max (prob(x)-length(N(x))
where max taken over all x \in DataType and
min taken over all functions N(x) : DataType → intervals of [0,1)
such that x \in N(x) and N(x) defines a partition of [0,1).

Now a tradeoff between Error and speed can be explored.

ADDED: Considering this measure of Error, removing 0.0 is indeed not crucial since it can have a prob(x)-length(N(x)) of upto nextfloat(0.0f0) which is small, compared to missing out on subnormals for example.

BFloat16 has the same issue as Float32, only much more acutely. FWIW, BFloat16.jl does the right thing, in that it rounds a Float64 to BFloat16, and therefore gives better dynamic range than Float32!

On Float16, I am not qualified to have an opinion.

Did you just pull that quality measure out of your hat or can I have a name / citation?

Because figuring out a good quality measure is highly nontrivial on a technical / mathematical level, and even more fraught on a social level.

Saying “measured by X, our approximation is good” is worthless unless you either intend to prove theorems using X or there is consensus in the community that X is a good way of measuring quality. You might need to go on a career-filling personal crusade to establish X as a measure of quality.

Just on the technical level: You want that defined as a measure of distance on probability measures on [0, 1]. Requesting a single metric (triangular equality) may be too much, but you absolutely want a uniformity and you want to understand the dual space well. You want finitary approximations of the standard Lebesgue measure to be possible. You want it to be possible to talk about relative errors as well as absolute errors, both in the interval where your prob measure lives, and with respect to the resulting probabilities. I think wanting a single number that measures approximation quality is a fools errand – you probably want to define a family of pseudo-metrics and will then need to talk about a family of estimates and prove various interpolation inequalities.

Sure, but the measure I’ve come up with is rather simple when parsed (maybe I didn’t specify it so clearly). As a simple measure it allows the users to follow and theorem provers to use. As for “goodness”, in this multi-objective optimization world (need to account for speed and even stable latency), simplicity is enough.

To simplify the measure, imagine sampling a true uniform [0,1) distribution, and then rounding to a close element of DataType (i.e. Float32). The rounding can be arbitrary, but it must be to a close element. Now each element of the DataType will have a certain probability (the measure of its basin of attraction). We evaluate the goodness of the RNG by comparing the probabilty of said element to this true probability by the most favorable rounding procedure. And consider the worst case element of DataType to get the final Error(RNG).

This in words, is the measure Error(RNG) from my previous post.

not if you want rand to be suitable for cryptographic purposes. although to be fair, that is a difficult bar to pass

I think what you came up with is mostly equivalent to levy metric

But I am arguing strenuously that x + epsilon with epsilon determined by the number of mantissa bits is conceptually wrong for floating point numbers – it misses the point of floating the point, and we must do better.

We need x * (1+epsilon_1) + epsilon_2, where epsilon_1 is determined by the number of mantissa bits and epsilon_2 is determined by the number of exponent bits.

This is because the precision of all other floating point operations is measured by this (epsilon_1, epsilon_2) pair, in order to allow users to write things like 1/x with impunity.

Will check this out.

Since the float datatypes are already mapped to the real numbers, we don’t have much freedom in choosing probabilities except for playing with rounding methods. The 1/x problem is exacerbated by losing many small values of float (as their density near 0.0 is much higher), but this is actually reflected in the quality measure I’ve mentioned. Specifically, sampling the denormal floats will improve this.

I’m still missing a wholistic view on the motivation here. I totally get the theoretical attraction to being able to sample from all representable values, but I’m not really understanding the practical objectives and tradeoffs. Even Downey’s paper draft and the python docs are silent on this practical view.

  • There’s the not-outputting 0 thing. This part I get — not only does it cause the potential for a rare (or not-so-rare) singularity, its inclusion also ever-so-slightly biases the mean of rand to be less than 0.5. But it’s also trivial to do rejection sampling while being highly impractical to do the inverse. Ultimately, its exclusion was considered and declined, even in the face of — at the time — a near-zero performance cost and improved overall statistics.
  • So then there’s increasing the density of the possible sampled points from the real line (prior to rounding). Potential numbers for this density are:
    • 2^{-11}: Float16 status quo
    • 2^{-24}: Float32 status quo (and the dense sampling of all possible Float16 values including subnormals)
    • 2^{-52}: Float64 before Julia v1.7
    • 2^{-53}: Float64 status quo
    • 2^{-76}: The optimized Float64 algorithm from the blog post on corsix.org
    • 2^{-126}: Dense sampling of all possible normal Float32 values
    • 2^{-149}: Dense sampling of all possible Float32 values, including subnormals
    • 2^{-1022}: Dense sampling of all possible normal Float64 values
    • 2^{-1074}: Dense sampling of all possible Float64 values, including subnormals

Where the line for “good enough” is drawn will obviously depend on what you’re doing with the random numbers. So what are some example lines in the sand? Are there algorithms that would have significant error with 2^{-53} but not with some 2^x?

9 Likes

x86 has a native instruction for int->float conversions. I imagine other architectures do, as well. So this had ought to be as simple as

uint_to_float(u::UInt32) = muladd(Float32(u), Float32(exp2(-32)), Float32(exp2(-33)))

This will round the result on conversion to Float32 and again during the addition part of the muladd (fma is not semantically necessary here because the multiplication is exact), both using the current rounding mode (almost certainly round-to-nearest). So it’s not the single rounding step you wanted, but probably plenty close for practical use. A high/low split of the UInt32 should permit you to do this with exact (or near-exact, at least) rounding, at the cost of some extra instructions. If you were simply willing to settle for Float32(u) * Float32(exp2(-32)) (i.e, no bias correction), you could avoid any rounding at all. Note that a high/low split or other technique is necessary for UInt16->Float16 because typemax(UInt16) > floatmax(Float16).

code_native(x->map(uint_to_float,x), (NTuple{1,UInt32},))
code_native(x->map(uint_to_float,x), (NTuple{8,UInt32},))

I notice that the vector case does not use a SIMD version of the conversion instruction. Instead it does a long-form conversion using different SIMD instructions. Maybe it isn’t available on my machine or maybe LLVM is being silly here, but I’ll leave that up to the compiler.

The problem here is that Float32(u) loses precision because Float32 can only represent every integer up to 2^{24}. Beyond that, Float32(u) will truncate, first to even numbers up to 2^{25}, then to multiples of 4 up to 2^{26}, and so on. The post on HN uses a loop to get around this, and I think cutting this loop short after the 32nd random bit is one way to accomplish what I’m suggesting.