(quick notes)
If you are only concerned with floating point values that are representable within a given type, e.g. either Float32 or Float64, and you ignore possible overflow and possible underflow (so you are choosing to work only with those x and y where x, y, (x/y), and (x/y)*y are all nonzero, finite, and not subnormal), then your request is easier to think through. For simplicity of language, let's assume x > 0.0 && y > 0.0.
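As a rough sketch of that restriction, the check below tests whether a pair (x, y) falls inside the range being discussed. The names `is_ordinary` and `in_scope` are made up for illustration; only `isfinite`, `iszero`, and `issubnormal` come from Base.

```julia
# Hypothetical helper: true when v is finite, nonzero, and not subnormal.
is_ordinary(v::AbstractFloat) = isfinite(v) && !iszero(v) && !issubnormal(v)

# Hypothetical helper: the restriction described above, for positive x and y.
# x, y, x/y, and (x/y)*y must all be ordinary values of the same type.
function in_scope(x::T, y::T) where {T<:AbstractFloat}
    q = x / y
    return x > 0 && y > 0 && all(is_ordinary, (x, y, q, q * y))
end

println(in_scope(2.8020393306553872, 1.0303474844527714))
println(in_scope(floatmin(Float64), 4.0))   # quotient is subnormal, so out of scope
```

Pairs that fail this check can still be divided, of course; the reasoning here simply does not claim to cover them.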
The result of an arithmetic operation (+, -, *, /) on an IEEE Standard floating point type (e.g. Float16, Float32, Float64) will be either equal to the ideal result or, when the ideal result is not representable, within 1 unit of the least significant bit of it (within half a unit under the default RoundNearest mode). Another way to understand this is through the results obtained when RoundUp and RoundDown are used.
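One way to see that bound for division is to measure the error against a much higher precision quotient. This is only a sketch: `quotient_error_in_ulps` is a hypothetical name, and BigFloat (256 bits by default) is used as a stand-in for the ideal result rather than the ideal result itself.

```julia
# Hypothetical helper: error of the Float64 quotient x / y, measured in units
# of eps(q), the spacing of Float64 values at the computed result q.
function quotient_error_in_ulps(x::Float64, y::Float64)
    q = x / y                         # rounded to nearest (the default mode)
    ideal = big(x) / big(y)           # high-precision stand-in for the ideal quotient
    return Float64(abs(big(q) - ideal) / big(eps(q)))
end

println(quotient_error_in_ulps(1.0, 3.0))
println(quotient_error_in_ulps(0.75, 0.25))   # exactly representable quotient
```

With the default rounding mode, the values printed should not exceed 0.5.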
At least one of arithop(x, y, RoundDown) and arithop(x, y, RoundUp) will be equal to arithop(x, y, RoundNearest). If the result is exactly representable in the given type (e.g. 0.25 + 0.75), then all three will give the same result. If all three are equal, then you know the value of the largest z where z * y < x is prevfloat(x/y), and the value of the largest z where z * y <= x is x/y.
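The `divdown` used in the session further down is not a Base Julia function, and these notes do not show where it comes from. The sketch below is one possible stand-in, not necessarily how that function works: it rounds to nearest and then uses the sign of the fused residual `fma(q, y, -x)` to decide whether to step one value down or up. The names `div_down_sketch` and `div_up_sketch` are hypothetical, and the code assumes x > 0, y > 0, and no overflow or underflow anywhere.

```julia
# Hypothetical stand-in for division rounded toward -Inf.
function div_down_sketch(x::T, y::T) where {T<:Union{Float32,Float64}}
    q = x / y                          # rounded to nearest
    r = fma(q, y, -x)                  # q*y - x with a single rounding
    return r > 0 ? prevfloat(q) : q    # q overshot the ideal quotient: step down
end

# Hypothetical stand-in for division rounded toward +Inf.
function div_up_sketch(x::T, y::T) where {T<:Union{Float32,Float64}}
    q = x / y
    r = fma(q, y, -x)
    return r < 0 ? nextfloat(q) : q    # q undershot the ideal quotient: step up
end

x, y = 2.8020393306553872, 1.0303474844527714
lo, mid, hi = div_down_sketch(x, y), x / y, div_up_sketch(x, y)
println((lo, mid, hi))
println(lo == mid || hi == mid)        # at least one directed result matches nearest

# An exactly representable quotient: all three agree.
println(div_down_sketch(0.75, 0.25) == 0.75 / 0.25 == div_up_sketch(0.75, 0.25))
```

The design choice here is to avoid changing the rounding mode at all: everything is computed in the default mode and corrected afterwards by at most one step.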
If the RoundUp and RoundNearest results are the same, then you know the largest z where z * y < x is the result of RoundDown, which will be prevfloat(x/y). If the RoundDown and RoundNearest results are the same, then you know the largest z where z * y <= x is (x/y).
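The case analysis above can be sketched as a small function, reading the comparisons z * y < x and z * y <= x as statements about the exact product. That reading is an assumption on my part; BigFloat is used only to evaluate the product without rounding (two Float64 values multiply exactly within its default 256 bits). The names `exact_lt`, `exact_le`, and `largest_z` are hypothetical.

```julia
# Hypothetical helpers: compare the exact product z*y against x.
exact_lt(z::Float64, y::Float64, x::Float64) = big(z) * big(y) <  big(x)
exact_le(z::Float64, y::Float64, x::Float64) = big(z) * big(y) <= big(x)

# Hypothetical helper: the largest Float64 z with z*y < x and with z*y <= x,
# assuming x > 0, y > 0, and no overflow or underflow.
function largest_z(x::Float64, y::Float64)
    q = x / y                                               # rounded to nearest
    z_le = exact_le(q, y, x) ? q : prevfloat(q)             # q rounded up: step down
    z_lt = exact_lt(z_le, y, x) ? z_le : prevfloat(z_le)    # exact quotient: step down once more
    return (lt = z_lt, le = z_le)
end

println(largest_z(2.8020393306553872, 1.0303474844527714))
println(largest_z(0.75, 0.25))     # exact quotient, so the two answers differ
```

When the quotient is not exactly representable, the two answers coincide; they differ by one step only when x/y is exact.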
julia> x,y
(2.8020393306553872, 1.0303474844527714)
julia> roundnearest_for_z = x/y
2.7195090713921437
julia> rounddown_for_z = divdown(x,y)
2.7195090713921433
julia> roundnearest_for_z * y == x && rounddown_for_z * y == x
false
julia> roundnearest_for_z * y <= x && rounddown_for_z * y <= x
true
julia> roundnearest_for_z > rounddown_for_z
true
julia> roundnearest_for_z * y < x, rounddown_for_z * y < x
(false, true)
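You may find this surprising (same x, y). It is why the condition z * y <= x is likely to be mishandled when the answer sought really needs to meet the test however the test is evaluated: the rounded product compares equal to x, while the fused residual shows that the exact product is slightly larger than x.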
julia> roundnearest_for_z * y - x
0.0
julia> fma(roundnearest_for_z, y, -x)
2.1622281971614692e-16
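To make the disagreement easier to inspect, here is a minimal sketch that evaluates the same test two ways for the values around x/y. The helper name `le_two_ways` is hypothetical; `fma`, `prevfloat`, and `nextfloat` are from Base. It assumes no underflow, so that the sign of the fused residual matches the sign of the exact difference.

```julia
# Hypothetical helper: evaluate the test "z * y <= x" two ways.
function le_two_ways(z::Float64, y::Float64, x::Float64)
    rounded = z * y <= x             # the product is rounded to Float64 before comparing
    fused   = fma(z, y, -x) <= 0     # z*y - x is rounded only once, keeping the exact sign
    return (rounded = rounded, fused = fused)
end

x, y = 2.8020393306553872, 1.0303474844527714
q = x / y

# Compare the two readings at q and at its immediate neighbours.
for z in (prevfloat(q), q, nextfloat(q))
    println(z, " => ", le_two_ways(z, y, x))
end
```

Going by the session output above, at z = x/y the rounded test passes while the fused test fails. Which of the two is the "right" test depends on how z will be used afterwards.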