Calling rand(Float64) will generate a random number in the range [0, 1). However, I want to draw random numbers from all possible Float64 values. Does anyone know of a good way to do this? I feel like I’m just overlooking something basic.
reinterpret(Float64, rand(UInt64)) will include NaNs.
This will exclude NaNs and include Infs:
function randomfloat()
    result = NaN
    while isnan(result)
        result = reinterpret(Float64, rand(UInt64))
    end
    result
end
This will exclude NaNs and Infs:
function randomfloat()
    result = NaN
    while isnan(result) || isinf(result)
        result = reinterpret(Float64, rand(UInt64))
    end
    result
end
With what distribution?
A basic issue here is that the Float64 values are not uniformly distributed: on the floating-point number line they are dense near zero and increasingly sparse at large magnitudes, so simply reinterpreting a random UInt64 will not give values that are uniformly distributed in ℝ.
You could do rand(-1:2:+1) * rand(Float64) * floatmax(Float64) to draw uniformly over roughly the full range, but you won’t technically get “all possible Float64 values” this way.
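To make the spacing point concrete, here is a small sketch (the variable names and the sample size n are arbitrary choices of mine) that compares the gap between neighbouring Float64 values at two magnitudes and counts how many reinterpreted bit patterns have magnitude below 1:

```julia
# Gap to the next representable Float64 at two very different magnitudes.
gap_near_one = eps(1.0)       # 2.0^-52
gap_far_out  = eps(1.0e300)   # many orders of magnitude larger

# Fraction of random bit patterns whose value has magnitude below 1.
# NaNs compare false and are therefore not counted.
n = 10^6  # arbitrary sample size
frac_below_one = count(_ -> abs(reinterpret(Float64, rand(UInt64))) < 1, 1:n) / n

println((gap_near_one, gap_far_out, frac_below_one))
```

Roughly half of all bit patterns have magnitude below 1, because about half of the exponent values lie there, so the reinterpret approach is uniform over bit patterns rather than over the reals.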
Your responses have been very helpful!
In this case, I’m interested in examining how hard it would be to create something like Go’s fuzzing tool in Julia. Right now I’ve only created the dumbest, simplest possible utility function.
function mutation_engine(c::Channel, args...)
    # Can't do strings for now
    any(x -> isa(x, String), args) && throw(ArgumentError("Can't produce random strings yet"))
    # Gather the types of each argument
    types = typeof.(args)
    # Generate random values of each type for as long as necessary
    while true
        put!(c, rand.(types))
    end
end
I then realized that I was only generating Float64s in the range [0, 1) and actually want to examine behavior over all possible Float64s. It’s clear that I’d need to put in a lot more thought about how such a tool would work. Would users want all possible Float64 values including Inf and NaN? Would they want them evenly distributed over the reals? ¯\_(ツ)_/¯
Anyway, thanks so much for the help!
For fuzzing, Infs and NaNs would be very interesting. As for the rest of the numbers, the distribution should perhaps concentrate on some definition of problematic numbers.
As an interesting distribution, one could imagine a distribution with uniform popcount (for Ints and reinterpreted Float64s). So in that case, how would one sample such a distribution efficiently?
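One simple way to get a uniform popcount, as a sketch rather than a tuned implementation: pick the number of set bits uniformly from 0:64, then set that many distinct bit positions chosen at random. The function name rand_uniform_popcount is made up for this example.

```julia
using Random

# Hypothetical helper: draw a UInt64 whose popcount is uniform on 0:64.
function rand_uniform_popcount()
    k = rand(0:64)                 # target number of set bits
    positions = randperm(64)[1:k]  # k distinct bit positions in 1:64
    x = zero(UInt64)
    for p in positions
        x |= one(UInt64) << (p - 1)
    end
    return x
end

bits = rand_uniform_popcount()
value = reinterpret(Float64, bits)  # may be NaN or Inf, like any raw bit pattern
```

Shuffling all 64 positions is more work than strictly needed; a partial shuffle that stops after k swaps would be cheaper if this ever showed up in a profile.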
I’ve actually thought about optimal distributions for sampling Floats for testing numeric functions. IMO the optimal would be something like:
- 1% chance of Infs/NaNs/0.0/-0.0
- 29% chance of uniform subnormals and near subnormals
- 30% chance of very large exponents (i.e. > sqrt(floatmax))
- 40% chance of exponentially distributed “normal” numbers, i.e. between 1e-3 and 1000
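A rough sketch of a sampler with these proportions might look like the following. The function name rand_test_float is made up, and the category boundaries are my own assumptions: "near subnormal" is taken to mean a magnitude below 16 * floatmin(Float64), and "exponentially distributed" is read as log-uniform.

```julia
# Hypothetical sampler; the thresholds and category boundaries are assumptions.
function rand_test_float()
    r = rand()
    if r < 0.01
        # Special values.
        return rand([Inf, -Inf, NaN, 0.0, -0.0])
    end
    s = rand(Bool) ? 1.0 : -1.0  # random sign
    if r < 0.30
        # Subnormals and "near subnormals": uniform magnitude below 16 * floatmin.
        return s * rand() * 16 * floatmin(Float64)
    elseif r < 0.60
        # Magnitudes above sqrt(floatmax): exponent field in 1535:2046,
        # i.e. from 2^512 up to floatmax, with random mantissa bits.
        bits = (UInt64(rand(1535:2046)) << 52) | (rand(UInt64) >> 12)
        return s * reinterpret(Float64, bits)
    else
        # Log-uniform magnitude between 1e-3 and 1e3.
        return s * exp10(6 * rand() - 3)
    end
end
```

Building the large values from bits rather than from exp2 of a random exponent avoids any chance of rounding up to Inf in that branch.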
Perhaps there is no point in even randomizing for these. Fuzzing should first go over a must-check list that includes these values, and maybe instance-specific previously flagged problem numbers and source-derived numbers, and only then move on to more randomized fuzzing.
Can you explain more what you mean by “source derived numbers”?
For example, if a function contains:
if importantParam < 0.75
    doSomething()
end
then 0.75, prevfloat(0.75), and nextfloat(0.75) would be reasonable ‘source derived numbers’.
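As a minimal sketch of that idea, assuming the constant has already been pulled out of the source somehow (the helper name boundary_values is made up, and the extraction step is not shown):

```julia
# Hypothetical helper: a constant from the source under test plus its
# two nearest Float64 neighbours.
boundary_values(c::Float64) = (prevfloat(c), c, nextfloat(c))

candidates = boundary_values(0.75)

# Evaluate the branch condition from the example on each candidate.
results = map(x -> x < 0.75, candidates)
println(results)  # (true, false, false)
```

Only prevfloat(0.75) takes the branch, so the three candidates together exercise both sides of the comparison as well as the exact boundary.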