@StefanKarpinski I agree, and it’s more or less the argument I thought was always being made, so I like that we’re converging on some common understanding.
Here’s some data on timing for my slightly modified code above:
julia> function rand32()
           i = rand(UInt64)
           a = i & 0xffffffff          # low 32 bits
           b = i >> 32                 # high 32 bits
           if a == 0x0
               if b == 0 ; b = 1 ; end # both halves zero: bump to 1 so we never return exactly 0
               c = ldexp(Float32(b), -64)
           else
               c = ldexp(Float32(a), -32)
           end
           return c
       end
rand32 (generic function with 1 method)
julia> @btime(rand32())
4.308 ns (0 allocations: 0 bytes)
0.4268289f0
julia> @btime(rand(UInt64))
2.866 ns (0 allocations: 0 bytes)
0xbcb048922324e5f6
julia> 4.308/2.866
1.5031402651779482
So it’s about 1.5 times the cost of just generating the 64 bits, and it improves the left tail a LOT: the smallest nonzero value it can generate is ldexp(1.0f0, -64), compared to (I think) ldexp(1.0f0, -24) for the current code?
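For concreteness, here are the two endpoints I’m comparing (values in the comments are approximate, and the 2^-24 figure assumes I have the current rand(Float32) behavior right):

ldexp(1.0f0, -64)   # smallest nonzero from rand32 above: 2^-64 ≈ 5.4f-20
ldexp(1.0f0, -24)   # smallest nonzero from the current rand(Float32): 2^-24 ≈ 6.0f-8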
The smallest positive Float32 (the subnormal 2^-149), on the other hand, is only about 2.58f-26 times as big as ldexp(1.0f0, -64).
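To spell out where that 2.58f-26 comes from (quick check; nextfloat(0.0f0) is the smallest positive subnormal Float32):

nextfloat(0.0f0)                       # 2^-149 ≈ 1.4f-45
nextfloat(0.0f0) / ldexp(1.0f0, -64)   # 2^-85 ≈ 2.58f-26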
So we also have the option to extend the dynamic range even further: we could do something like ldexp(Float32(rand(UInt32)), -64-32) as a 3rd step, and it would only get hit once every 2^64 calls on average.
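Just to sketch what I mean by that 3rd step (untested and not benchmarked; the name rand32_ext is only for illustration):

function rand32_ext()
    i = rand(UInt64)
    a = i & 0xffffffff              # low 32 bits
    b = i >> 32                     # high 32 bits
    if a != 0
        return ldexp(Float32(a), -32)
    elseif b != 0
        return ldexp(Float32(b), -64)
    else
        # all 64 bits were zero (once every 2^64 calls on average):
        # draw 32 more bits and push the range down toward 2^-96
        c = rand(UInt32)
        if c == 0 ; c = one(UInt32) ; end   # one choice for still avoiding an exact 0
        return ldexp(Float32(c), -96)
    end
end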