I imagine this would also be computationally expensive to do with only 32 bits, wouldn't it?

Converting a random string of 32 or 64 bits into the 23 or 52 random fraction bits of a uniform `[1,2)` distribution only requires two fast, vectorizable CPU instructions.

Just set the sign and exponent bits correctly via a bitwise `&` and then a bitwise `|`.
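As a minimal sketch of that two-instruction conversion for `Float64` (the function name `bits_to_float12` is made up for illustration and is not from any package): the `&` clears the sign and exponent bits, and the `|` writes in the bit pattern of `1.0`, whose exponent field covers `[1,2)`.

```julia
# Hypothetical helper: map 64 random bits to a Float64 in [1, 2).
function bits_to_float12(u::UInt64)
    # Bitwise AND: keep the 52 fraction bits, zero the sign and exponent bits.
    fraction = u & 0x000fffffffffffff
    # Bitwise OR: set sign = 0 and exponent = 1023, i.e. the bits of 1.0.
    return reinterpret(Float64, fraction | 0x3ff0000000000000)
end

x = bits_to_float12(rand(UInt64))   # 1.0 <= x < 2.0
```

Every value in `[1,2)` shares the same exponent, which is why no further logic is needed for this range.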

How would you actually set the exponent bits for a non-ragged uniform (0,1)?

There are as many floating point numbers in the range [0.25, 0.5) as there are in [0.5, 1) or [0.125, 0.25).
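That count can be checked directly, since positive floats are ordered the same way as their bit patterns. A small sketch for `Float64` (`floats_between` is a made-up name, and it assumes `0 < a <= b`):

```julia
# Hypothetical helper: number of Float64 values in [a, b), for 0 < a <= b.
# For positive floats, consecutive values have consecutive bit patterns.
floats_between(a::Float64, b::Float64) =
    reinterpret(UInt64, b) - reinterpret(UInt64, a)

floats_between(0.5, 1.0) == floats_between(0.25, 0.5) == floats_between(0.125, 0.25)
```

Each of these ranges holds `2^52` values, one per fraction bit pattern, even though each range is half as wide as the one above it.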

So it sounds like you’d need quite a bit of logic in actually processing your remaining bits of entropy to create that correct distribution in the upper bits.

While this uses twice as many bits as `sizeof(T)`, it should still be relatively fast:

```
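using Random
function nonraggedrand(rng = Random.GLOBAL_RNG)
    zero_sgnexp = 0x000fffffffffffff  # mask that keeps only the 52 fraction bits
    zero_frac   = 0xfff0000000000000  # mask that keeps only the sign and exponent bits
    # The sign and exponent of an ordinary (ragged) rand() are already correctly distributed.
    random_exponent = reinterpret(UInt64, rand(rng)) & zero_frac
    # Fresh random bits fill the whole fraction.
    random_fraction = rand(rng, UInt64) & zero_sgnexp
    reinterpret(Float64, random_exponent | random_fraction)
end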
```

Looks correct from a superficial glance:

```
julia> using RNGTest
julia> RNGTest.smallcrushJulia(nonraggedrand)
10-element Array{Any,1}:
0.5336216850647422
0.9230989723619558
0.49280958521636054
0.6834132908358886
0.8940787394692569
(0.5333583536888169, 0.5896538779593443)
0.37459169125750136
0.14003119092065008
0.7706765198832467
(0.5617563416418929, 0.9608245561823281, 0.37088087074827114, 0.839032775363961, 0.6894691545291662)
```

The idea is: the ragged randoms have the correct distribution of exponent bits. So, just combine those exponent bits with random fraction bits.

This uses 128 bits for a 64 bit random number, or 64 bits for a 32 bit random number. But I think that’s rather reasonable.
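For the 32 bit case, a sketch of the same idea might look like the following. The name `nonraggedrand32` and the use of `Random.default_rng()` are my own choices here, and the masks assume the IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits).

```julia
using Random

# Hypothetical Float32 analogue of nonraggedrand: 64 random bits per sample.
function nonraggedrand32(rng::AbstractRNG = Random.default_rng())
    zero_sgnexp = 0x007fffff  # keeps only the 23 fraction bits
    zero_frac   = 0xff800000  # keeps only the sign and 8 exponent bits
    # Take the sign and exponent from an ordinary (ragged) Float32 sample.
    random_exponent = reinterpret(UInt32, rand(rng, Float32)) & zero_frac
    # Fill the fraction with fresh random bits.
    random_fraction = rand(rng, UInt32) & zero_sgnexp
    return reinterpret(Float32, random_exponent | random_fraction)
end

x = nonraggedrand32()   # a Float32 with 0 <= x < 1
```

As with the `Float64` version, this is only as good as the exponent distribution of the underlying `rand(rng, Float32)`.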

Many random number generators intentionally discard extra random bits anyway, to help with things like the “birthday problem”, or just for the sake of having a larger state for a longer period.

Maybe it’s a better idea to use a PCG RXS-M-XS generator, and discard state by combining floating point numbers like this, than it is to use a PCG XSH-RS generator, for example?

Those generators can be fast and are vectorizable. If you’re only sampling serially, vectorization means the cost will be less than 2x for taking this approach.