# Is it really possible for rand to throw a zero?

5 Likes

Yes, and the difference between exact zero and the next number in line might be an index error which crashes the program or not.

Think of binning data and 0.0 falls into the "zero"th bin:

``````julia> a = 'A':'Z'
'A':1:'Z'

julia> a[ceil(Int, rand() * length(a))]
'K'
``````

versus (if you are unlucky)

``````julia> a[ceil(Int, (1-rand()) * length(a))]
ERROR: BoundsError: attempt to access 26-element StepRange{Char,Int64} at index [0]
``````

In Julia, you’d write `rand(a)` rather than `a[ceil(Int, rand() * length(a))]`, so this is more of a problem in other languages.

3 Likes

Or if one absolutely wants the index, `rand(eachindex(a))`.

1 Like

The `1/x` case (mentioned somewhere above) is interesting, because it is not sufficient to simply filter out `0.0`. If the zero is important enough to matter, then the interval between zero and `eps(T)` is probably also important, but rand(T) will only return multiples of `eps(T)`.

I think it’d be fairly straightforward to write a `rand` function which might return any representable float in [0,1], so that the probability of getting a zero is equal to `nextfloat(zero(T))`, i.e. 5.0e-324 for `Float64`. A naïve implementation would run at about half the pace of the standard random number generator, but it might be possible to make it almost as fast.

1 Like

Returning subnormals from rand is likely to slow things down on many microprocessors.

Are you distinguishing “other important bits” from other bits in unused portions of memory?

Only about one in 10^308 returned numbers would be subnormal, so it would not affect performance in any measurable way. (Of course, that’s also a reason not to bother with returning subnormal numbers if it requires any extra coding effort.)

For most practical problems I can think of where 1/x is needed, either one uses a bounded f(1/x) (so that the expectation is finite) and/or the method is ill-behaved anyway so that one would use a transformation. I think that it is important to consider only examples where expected values a finite and preferably second moments also exist, otherwise MC is a dubious proposition anyway.

But that’s a red herring: if think the Julian solution is to implement a sampler for a custom `Open01` object, and then the semantics are clear. This could of course live in a package.

1 Like

No… This is a simplified example how an index error might happen. Not an example how to draw a random element from a vector (yes, you wouldn’t do it this way.)

Another possibility for getting (0,1] is

``````openclosed(t...) = (r = rand(t...); iszero(r) ? one(r) : r)
``````

which simply switches out the zero value with one.

1-rand() will probably be faster.

3 Likes
``````julia> @btime openclosed()
3.848 ns (0 allocations: 0 bytes)
0.33216440795825175

julia> @btime y() # 1-rand(t...)
3.657 ns (0 allocations: 0 bytes)
0.04763324146749448
``````

How do the timings compare for `openclosed(n)` and `y(n)` where n is 100, 1_000, 10_000?

`y(n)` errors out, while `openclosed(100)` is valid code

Ideally we could have a `RandomMode` like we have a `RoundingMode`, no?

Like others said, yes, and this definition doesn’t apply only to `Float64`, so it’s not only theoretical. For example, the array `rand(Float16, 2^12)` has a pretty good chance to contain `0`.

One pretty efficient solution to generate in `(0, 1)` is to set the least significant bit to `1` (which is always set to `0` by default), e.g. `reinterpret(Float64, reinterpret(UInt64, rand(rng, Float64)) | 0x0000000000000001)`. I believe this should keep the distribution mostly uniform. This is what the `RandomExtensions.jl` package does (it provides few related distributions, like `OpenOpen`, `CloseOpen` etc.).

3 Likes

From that point of view then the best choice would be [0,1] not [0,1) because it is easier to discard 1 and 0.

It would be easier to discard 1 and 0, but it has other advantages to let vanilla `rand()` fit into a half open interval.

As I said above, it would probably make sense to customize this with just different samples — there is no universal best choice.

That said, if you look at the source, drawing a float in [0, 1) can be done with a some rather elegant bit-fiddling, so in that sense it is rather special and reasonable as a default.

2 Likes