I wanted to ask about the output distribution of floating-point
rand(): Naively, I would have expected the following distribution:
- the sign bit is zero
- the mantissa is a random bitstring
- the exponent is geometrically distributed, e.g. const − leading_zeros of an infinite random bitstream
- subnormals can be rounded to zero, but should not appear in practice for 32-bit or 64-bit numbers (probability of 2^(−2^7) for Float32, which is approximately never)
This corresponds to sampling a random real number uniformly in [0,1] and rounding down to the next smaller representable floating-point number.
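For concreteness, here is a sketch of that ideal construction (my own illustration of the rounded-real distribution, not anything Base actually does; `ideal_rand64` is a hypothetical name): count leading zeros of a random bitstream to get the geometrically distributed exponent, then fill the mantissa with fresh random bits.

```julia
# Sketch: sample a uniform real in [0,1) and round down to a Float64.
# Hypothetical helper, not part of Base.
function ideal_rand64()
    # Count leading zeros of an (effectively infinite) random bitstream
    # to obtain a geometrically distributed exponent offset e,
    # i.e. the result lies in [2^-(e+1), 2^-e).
    e = 0
    while true
        w = rand(UInt64)
        if w == 0x0000000000000000
            e += 64
            e >= 1074 && return 0.0   # would round down to zero anyway
        else
            e += leading_zeros(w)
            break
        end
    end
    # Subnormal range has probability ~2^-1022; skipped here for simplicity.
    e >= 1022 && return 0.0
    # Fresh random bits for the 52-bit mantissa -- not shifted-in zeros.
    frac = rand(UInt64) & 0x000fffffffffffff
    # Biased exponent field for a value in [2^-(e+1), 2^-e).
    reinterpret(Float64, (UInt64(1022 - e) << 52) | frac)
end
```

Under this construction every representable float in [0,1) can occur, with probability equal to the measure of the real interval that rounds down to it.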
In reality, this is not the case:
```julia
julia> function ff(N, k, fT, iT)
           i = 0
           for n = 1:N
               r = rand(fT)
               if r < 2.0^(-k)
                   rr = reinterpret(iT, r)
                   @show r
                   @show rr
                   i += 1
                   i > 10 && return nothing
               end
           end
           nothing
       end

julia> srand(42); ff(1<<25, 24, Float64, UInt64)
r = 5.29506605229102e-8
rr = 0x3e6c6d7bbe000000

julia> srand(42); ff(1<<25, 24, Float32, UInt32)
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000
r = 0.0f0
rr = 0x00000000
```
From this it is pretty clear that a simpler construction is used, e.g. setting the mantissa to a random bitstring and float-subtracting one, which corresponds to

mantissa0 = rand52
exponent = const − leading_zeros(mantissa0)
mantissa = mantissa0 << (leading_zeros(mantissa0) + 1)

i.e. all-zero bits are shifted in from the right, instead of random bits.
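The effect is easy to check directly: with the subtract-one construction every output is an integer multiple of 2^-52 (2^-23 for Float32), so small results carry trailing zero mantissa bits, exactly as in the reinterpreted value above. A minimal sketch of that construction (my paraphrase, not Julia's actual source; `naive_rand64` is a hypothetical name):

```julia
# Sketch of the fixed-point-style construction: build 1.xxx in [1,2)
# from 52 random mantissa bits, then float-subtract 1.0.
function naive_rand64()
    frac = rand(UInt64) & 0x000fffffffffffff          # 52 random bits
    u = reinterpret(Float64, 0x3ff0000000000000 | frac)  # 1.frac in [1,2)
    u - 1.0   # in [0,1), always a multiple of 2^-52
end
```

Because every output is a multiple of 2^-52, any result below 2^-k necessarily has at least k trailing zero bits in its mantissa once normalized.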
Is this distribution standard / intended?
Should this be documented? (i.e. the fact that only 23-/52-bit fixed-point numbers are generated and that the probability of
rand(Float32) == 0 is, relatively speaking, very large)
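For scale (my own arithmetic, assuming a 23-bit fixed-point construction for Float32, which the experiment above suggests): P(rand(Float32) == 0) would be 2^-23, so the 2^25-sample run should hit zero about four times, consistent with the four hits shown, whereas under the rounded-real distribution a zero is essentially impossible.

```julia
# Expected number of exact zeros among 2^25 draws of rand(Float32),
# assuming 23-bit fixed-point outputs, i.e. P(zero) = 2^-23.
p_fixed  = 2.0^-23
n        = 2^25
expected = p_fixed * n
println(expected)   # 4.0
```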