Output distribution of rand(Float32) and rand(Float64), thread 2

I’m still missing a holistic view on the motivation here. I totally get the theoretical attraction of being able to sample from all representable values, but I’m not really understanding the practical objectives and tradeoffs. Even Downey’s paper draft and the Python docs are silent on this practical view.

  • There’s the not-outputting-0 thing. This part I get: not only does a possible 0 create the potential for a rare (or not-so-rare) singularity, its inclusion also ever-so-slightly biases the mean of rand to be less than 0.5. But it’s also trivial to do rejection sampling yourself (see the sketch after this list), while doing the inverse is highly impractical. Ultimately, its exclusion was considered and declined, even in the face of what was, at the time, a near-zero performance cost and improved overall statistics.
  • So then there’s increasing the density of the possible sampled points on the real line (prior to rounding); the sketch after this list shows how the status-quo grids arise. Potential spacings for this grid are:
    • 2^{-11}: Float16 status quo
    • 2^{-24}: Float32 status quo (and the dense sampling of all possible Float16 values including subnormals)
    • 2^{-52}: Float64 before Julia v1.7
    • 2^{-53}: Float64 status quo
    • 2^{-76}: The optimized Float64 algorithm from the blog post on corsix.org
    • 2^{-126}: Dense sampling of all possible normal Float32 values
    • 2^{-149}: Dense sampling of all possible Float32 values, including subnormals
    • 2^{-1022}: Dense sampling of all possible normal Float64 values
    • 2^{-1074}: Dense sampling of all possible Float64 values, including subnormals
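
To make the rejection point and these grids concrete, here’s a minimal sketch (my own Julia, not Base’s actual implementation) of how the 2^{-24} and 2^{-53} status-quo spacings arise from scaling an N-bit integer, and how cheap it is for a caller to reject an exact zero after the fact:

```julia
using Random

# Status quo as a sketch: an N-bit random integer scaled by 2^-N, giving an
# output grid with spacing 2^-24 (Float32) or 2^-53 (Float64). Base's actual
# code is different, but lands on the same grids.
coarse32(rng=Random.default_rng()) = Float32(rand(rng, UInt32) >> 8) * 2f0^-24
coarse64(rng=Random.default_rng()) = (rand(rng, UInt64) >> 11) * 2.0^-53

# Rejecting an exact zero is trivial for the caller (the retry branch fires
# with probability 2^-53); re-introducing zero after the fact is not.
function coarse64_nonzero(rng=Random.default_rng())
    while true
        x = coarse64(rng)
        x != 0.0 && return x
    end
end
```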

Where the line for “good enough” is drawn will obviously depend on what you’re doing with the random numbers. So what are some example lines in the sand? Are there algorithms that would have significant error with a 2^{-53} spacing but not with some finer 2^{-x}?
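
To make the question concrete, here’s the kind of thing I mean (my own example, not one from the papers): if you turn a uniform draw into an exponential via the inverse transform, the smallest nonzero value the grid can produce caps the tail you can ever see:

```julia
# With spacing 2^-b (and an exact zero excluded), the smallest value rand()
# can return is 2^-b, so -log(rand()) can never exceed b*log(2); the grid
# silently truncates the exponential tail at that point.
tail_cap(b) = b * log(2)

tail_cap(24)    # ≈ 16.6   (Float32 status quo)
tail_cap(53)    # ≈ 36.7   (Float64 status quo)
tail_cap(76)    # ≈ 52.7   (the corsix.org algorithm)
tail_cap(1074)  # ≈ 744.4  (dense Float64, subnormals included)
```

Whether a tail cut off at ≈ 36.7 actually matters for anything is exactly the kind of practical line in the sand I’m asking about.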
