Output distribution of rand()

I wanted to ask about the output distribution of floating-point rand(): Naively, I would have expected the following distribution:
-the sign bit is zero
-the mantissa is a random bitstring
-the signigicant is an exponentially distributed integer, e.g. const-leading_zeros of an infinite bitstream
-subnormals can be rounded to zero, but should not appear in real life for 32-bit or 64-bit numbers (probability of 2^(-2^(7)) for Float32 which is approximately never)

This corresponds to sampling a random real number in [0,1], according to uniform distribution, and rounding down to the next representable floating point number.

In reality, this is not the case:

julia> function ff(N,k, fT,iT)
              for n=1:N
              if r < 2.0^(-k)
              rr=reinterpret(iT, r)
              @show r
              @show rr
              i>10 && return nothing
              end end

julia> srand(42); ff(1<<25, 24, Float64, UInt64)
r = 5.29506605229102e-8
rr = 0x3e6c6d7bbe000000
julia> srand(42); ff(1<<25, 24, Float32, UInt32)
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000

From this it is pretty clear that a more naive construction is used; e.g. setting the mantissa as a random bitstring, and float-subtracting one, which corresponds to setting mantissa0=rand52; significant=const+leading_zeros(mantissa0); mantissa = mantissa0 << (leading_zeros(mantissa0)+1), which shifts in all-zero bits from the right, instead of shifting in random bits.

Is this distribution standard / intended?

Should this be documented? (i.e. the fact that only 23/52-bit fixed-point numbers are generated and the the probability of rand(Float32)==0 is very large)

Is this distribution standard / intended?

This is how dSFMT, which is used internally works: it only generates 52 random bits at a time, and uses that to produce a random Float64 in [1, 2), to which 1 is substracted. Nowadays, we do the substraction on the Julia side. So yes this is intended, and generating a random float in [0,1) in a more perfect way in software (given uniform random bits packed in an UInt64) would be much less time efficient that this hack. I’m no expert at all on the topic, but this looks very similar to using x = a+rand()*b to get a random number in [a, a+b), which is a relatively widespread technique in scientific packages (and in “modern” C++ – at least the g++'s standard library – as well if I remember correctly).

1 Like

Thanks. A “correct” [0,1)-float would take 54 bits of entropy, instead of 52; alas, with a variable bit-rate encoding.

Reason for asking was that I was thinking about submitting a PR for a CSPRNG, and as far as I understood, there is currently no generic structure that does buffering and conversion-to-float from a random bitstream?

Using the native instructions, aes128-ctr has roughly the same speed as dsfmt for generating uint64 (on julia 0.7, on my machine, and with caching of all the roundkeys, which amounts to <12 x i128>=192 bytes of state).

Without hardware support, one could use chacha20/12; I think writing aes in software is too error-prone if avoidable (timing side-channels! DJB’s crypto-suites are cool for being really hard to mess up).

Re hack and subtraction: The principled way could be a hack along the lines of

julia> function getfloat(r::UInt64)
       mo = leading_zeros(r)
       if mo >11
       mo = 12 + leading_zeros(rand(UInt64))
       pref = reinterpret(UInt64,1022-mo)<<52
       return pref | ((r <<12)>>12)

This has a small bias that it avoids really tiny random numbers (smaller than 2^(-76) that should appear once-in-2^76; fixable by using a while loop). The if-block triggers with a probability of 2^(-12) only, so it should not dominate the cost.

edit: I think the real performance problem is that the branch can prevent simd :frowning:

Ok, I did some benchmarking. The distribution is mainly a problem for Float32. I think the following variant is competitive, and gives a much smoother distribution near 0, because we use the entire 32 bits of entropy:

julia> using Random
julia> using BenchmarkTools
julia> N=20_000; inp=rand(UInt32, N); outp = zeros(Float32,N);
julia> function getfloat_trail(r::UInt32)
              mo = Int32(trailing_zeros(r))
              pref = reinterpret(UInt32,Int32(126)-mo)<<23
              res = pref | (r>>9)
              return reinterpret(Float32,res)
julia> function getfloat_std(r::UInt32)
              rr = 0x3f800000 | (r>>9)
              res = reinterpret(Float32, rr)-1.0f0
              return res
julia> @btime Random.rand!($inp);
  17.782 μs (0 allocations: 0 bytes) # 1.7 cycles/number
julia> @btime Random.rand!($outp);
  21.121 μs (0 allocations: 0 bytes) # 2.1 cycles / number
julia> @btime broadcast!($getfloat_std, $outp, $inp);
  4.413 μs (0 allocations: 0 bytes) # 0.4 cycles / number
julia> @btime broadcast!($getfloat_trail, $outp, $inp);
  8.958 μs (0 allocations: 0 bytes) # 0.9 cycles / number

Note: This is really slow on my archlinux binary distribution of 0.6.3, and the above was from a recent master 0.7, built from source.