Is it really possible for rand to throw a zero?

Think of it as an RNG thrown in for free :wink:

5 Likes

Yes, and the difference between exact zero and the next number in line might be an index error which crashes the program or not.

Think of binning data and 0.0 falls into the "zero"th bin:

julia> a = 'A':'Z'
'A':1:'Z'

julia> a[ceil(Int, rand() * length(a))]
'K'

versus (if you are unlucky)

julia> a[ceil(Int, (1-rand()) * length(a))]
ERROR: BoundsError: attempt to access 26-element StepRange{Char,Int64} at index [0]

In Julia, you’d write rand(a) rather than a[ceil(Int, rand() * length(a))], so this is more of a problem in other languages.

3 Likes

Or if one absolutely wants the index, rand(eachindex(a)).

1 Like

The 1/x case (mentioned somewhere above) is interesting, because it is not sufficient to simply filter out 0.0. If the zero is important enough to matter, then the interval between zero and eps(T) is probably also important, but rand(T) will only return multiples of eps(T).

I think it’d be fairly straightforward to write a rand function which might return any representable float in [0,1], so that the probability of getting a zero is equal to nextfloat(zero(T)), i.e. 5.0e-324 for Float64. A naïve implementation would run at about half the pace of the standard random number generator, but it might be possible to make it almost as fast.

1 Like

Returning subnormals from rand is likely to slow things down on many microprocessors.

Are you distinguishing “other important bits” from other bits in unused portions of memory?

Only about one in 10^308 returned numbers would be subnormal, so it would not affect performance in any measurable way. (Of course, that’s also a reason not to bother with returning subnormal numbers if it requires any extra coding effort.)

For most practical problems I can think of where 1/x is needed, either one uses a bounded f(1/x) (so that the expectation is finite) and/or the method is ill-behaved anyway so that one would use a transformation. I think that it is important to consider only examples where expected values a finite and preferably second moments also exist, otherwise MC is a dubious proposition anyway.

But that’s a red herring: if think the Julian solution is to implement a sampler for a custom Open01 object, and then the semantics are clear. This could of course live in a package.

1 Like

No… This is a simplified example how an index error might happen. Not an example how to draw a random element from a vector (yes, you wouldn’t do it this way.)

Another possibility for getting (0,1] is

openclosed(t...) = (r = rand(t...); iszero(r) ? one(r) : r)

which simply switches out the zero value with one.

1-rand() will probably be faster.

3 Likes
julia> @btime openclosed()
  3.848 ns (0 allocations: 0 bytes)
0.33216440795825175

julia> @btime y() # 1-rand(t...)
  3.657 ns (0 allocations: 0 bytes)
0.04763324146749448

How do the timings compare for openclosed(n) and y(n) where n is 100, 1_000, 10_000?

y(n) errors out, while openclosed(100) is valid code

Ideally we could have a RandomMode like we have a RoundingMode, no?

Like others said, yes, and this definition doesn’t apply only to Float64, so it’s not only theoretical. For example, the array rand(Float16, 2^12) has a pretty good chance to contain 0.

One pretty efficient solution to generate in (0, 1) is to set the least significant bit to 1 (which is always set to 0 by default), e.g. reinterpret(Float64, reinterpret(UInt64, rand(rng, Float64)) | 0x0000000000000001). I believe this should keep the distribution mostly uniform. This is what the RandomExtensions.jl package does (it provides few related distributions, like OpenOpen, CloseOpen etc.).

3 Likes

From that point of view then the best choice would be [0,1] not [0,1) because it is easier to discard 1 and 0.

It would be easier to discard 1 and 0, but it has other advantages to let vanilla rand() fit into a half open interval.

As I said above, it would probably make sense to customize this with just different samples — there is no universal best choice.

That said, if you look at the source, drawing a float in [0, 1) can be done with a some rather elegant bit-fiddling, so in that sense it is rather special and reasonable as a default.

2 Likes