I imagine this would also be computationally expensive to do with only 32 bits, no?

Converting a random string of 32 or 64 bits into 23 or 52 random bits of a uniform `[1,2)` distribution only requires two vectorizable and fast CPU instructions: just set the sign and exponent bits correctly via a bitwise `&` and then a bitwise `|`.
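For concreteness, here is a sketch of that two-instruction trick in Python (assuming IEEE-754 doubles; the helper name and the use of `struct` to reinterpret bits are mine, not from the post):

```python
import random
import struct

def rand_one_two(rng=random.Random()):
    """Hypothetical sketch: map 64 random bits to a uniform double in [1, 2).

    AND with 0x000FFFFFFFFFFFFF zeroes the sign and exponent, keeping 52
    random fraction bits; OR with 0x3FF0000000000000 sets sign 0 and
    biased exponent 1023, so every result lies in the binade [1, 2).
    """
    bits = rng.getrandbits(64)
    bits = (bits & 0x000FFFFFFFFFFFFF) | 0x3FF0000000000000
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x
```

Subtracting `1.0` afterwards shifts the result into `[0,1)`, but that is exactly what produces a ragged distribution near zero.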

How would you actually set the exponent bits for a non-ragged uniform (0,1)?

There are an equal number of floating-point numbers in the range [0.25, 0.5) as there are in [0.5, 1) or [0.125, 0.25).

So it sounds like you'd need quite some logic in actually processing your remaining bits of entropy to create the correct distribution in the upper bits.
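One way to sketch that logic (my illustration, not from the post): a uniform (0,1) value lands in [0.5, 1) with probability 1/2, in [0.25, 0.5) with probability 1/4, and so on, so the binade follows a geometric distribution, equivalent to counting leading zero bits of a random word. The function name is hypothetical:

```python
import random

def random_binade(rng=random.Random(), max_shift=64):
    """Hypothetical sketch: choose the binade for a non-ragged uniform (0,1).

    Returns k such that the value should fall in [2**-(k+1), 2**-k).
    k = 0 with probability 1/2, k = 1 with probability 1/4, etc. --
    the same as counting leading zeros of a stream of random bits.
    """
    k = 0
    while k < max_shift and rng.getrandbits(1) == 0:
        k += 1
    return k
```

The remaining 52 fraction bits can then be drawn uniformly, since every binade holds the same number of doubles.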

While this uses twice as many bits as `sizeof(T)`, it should still be relatively fast:

```julia
using Random

function nonraggedrand(rng = Random.GLOBAL_RNG)
    zero_sgnexp = 0x000fffffffffffff  # AND keeps only the 52 fraction bits
    zero_frac   = 0xfff0000000000000  # AND keeps only the sign and exponent bits
    random_exponent = reinterpret(UInt64, rand(rng)) & zero_frac
    random_fraction = rand(rng, UInt64) & zero_sgnexp
    reinterpret(Float64, random_exponent | random_fraction)
end
```

It looks correct at a superficial glance:

```julia
julia> using RNGTest
julia> RNGTest.smallcrushJulia(nonraggedrand)
10-element Array{Any,1}:
0.5336216850647422
0.9230989723619558
0.49280958521636054
0.6834132908358886
0.8940787394692569
(0.5333583536888169, 0.5896538779593443)
0.37459169125750136
0.14003119092065008
0.7706765198832467
(0.5617563416418929, 0.9608245561823281, 0.37088087074827114, 0.839032775363961, 0.6894691545291662)
```

The idea is: the ragged randoms have the correct distribution of exponent bits, so just combine those exponent bits with random fraction bits.
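The same idea can be ported to Python (my sketch, mirroring the Julia `nonraggedrand` above, with `struct` standing in for `reinterpret`; the function name is mine):

```python
import random
import struct

def nonragged_rand_py(rng=random.Random()):
    """Hypothetical port of nonraggedrand: take the sign/exponent bits
    from an ordinary (ragged) uniform draw and the fraction bits from a
    fresh 64-bit draw."""
    ZERO_SGNEXP = 0x000FFFFFFFFFFFFF  # AND keeps only the 52 fraction bits
    ZERO_FRAC   = 0xFFF0000000000000  # AND keeps only sign + exponent bits
    (ragged_bits,) = struct.unpack("<Q", struct.pack("<d", rng.random()))
    random_exponent = ragged_bits & ZERO_FRAC
    random_fraction = rng.getrandbits(64) & ZERO_SGNEXP
    (x,) = struct.unpack("<d", struct.pack("<Q", random_exponent | random_fraction))
    return x
```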

This uses 128 bits for a 64-bit random number, or 64 bits for a 32-bit random number. But I think that's rather reasonable.

Many random number generators intentionally discard extra random bits anyway, to help with things like the "birthday problem", or just for the sake of having a larger state for a longer period.

Maybe it's a better idea to use a PCG RXS-M-XS generator and discard state by combining floating-point numbers like this, than to use a PCG XSH-RS generator, for example?

Those generators can be fast and are vectorizable. If you're only sampling serially, vectorization means the cost of taking this approach will be less than 2x.