Largest float `z` such that `z*y <= x`

e3c6 · November 7, 2018, 3:45pm

Given floating-point numbers x,y, is there a way to obtain the largest floating point number z such that z * y <= x? Hopefully in a single operation. I tried z = prevfloat(y/x), but I’m not 100% sure this works in every case, and it involves two operations instead of one. Since we are at it, can I also get the largest z such that z * y < x (now with strict inequality)?

This is not the same as float division, x/y. Due to rounding error we could have (x/y) * y greater or smaller than x.

For simplicity let’s assume that x,y > 0.

Tamas_Papp · November 7, 2018, 4:05pm

EDIT the solution below is incorrect; as Julia 1.0 no longer has this functionality. See discussion below.

~~I am hoping~~

function divdown(x::T, y::T) where T <: AbstractFloat
    @assert x > 0 && y > 0 # other cases left as an exercise for the reader
    setrounding(() -> x/y, T, RoundDown)
end

~~would work but make sure to test it~~

JeffreySarnoff · November 7, 2018, 5:55pm

(quick notes)

If you are only concerned with floating point values that are representable within a given type, e.g. either Float32 or Float64, and you ignore possible overflow and possible underflow (so you are choosing to work with only those xs, ys where x and y and (x/y) and (x/y)*y are nonzero, finite, and not subnormal), then your request is easier to think through. And for simplicity of language, let’s assume x > 0.0 && y > 0.0.

The result of an arithmetic operation (+, -, *, /) on an IEEE Standard floating point type (e.g. Float16, Float32, Float64) will be either equal to, or within +/- 1 least significant bit of the ideal result. Another way to understand this is through the results obtained when RoundUp and RoundDown are used.

At least one of arithop(x, y, RoundDown) and arithop(x, y, RoundUp) will be equal to arithop(x, y, RoundNearest). If the result is exactly representable in the given type 0.25 + 0.75, then all three will give the same result. If all three are equal then you know the value of the largest z where z * y < x is prevfloat(x/y) and the value of the largest z where z * y <= x is x/y.

If the RoundUp and RoundNearest results are the same, then you know z where z * y < x is the result of RoundDown, which will be prevfloat(x/y). If the RoundDown and RoundNearest results are the same, then you know z where z * y <= x
is (x/y).


julia> x,y
(2.8020393306553872, 1.0303474844527714)

julia> roundnearest_for_z = x/y
2.7195090713921437

julia> rounddown_for_z = divdown(x,y)
2.7195090713921433

julia> roundnearest_for_z * y == x && rounddown_for_z * y == x
false

julia> roundnearest_for_z * y <= x && rounddown_for_z * y <= x
true

julia> roundnearest_for_z > rounddown_for_z
true

julia> roundnearest_for_z * y < x,  rounddown_for_z * y < x
(false, true)

and you may find this surprising (same x,y) … this is why the condition z * y <= x is likely to be mishandled when the answer sought really needs to meet the test when used in different ways.

julia> roundnearest_for_z * y - x
0.0

julia> fma(roundnearest_for_z, y, -x)
2.1622281971614692e-16

Elrod · November 7, 2018, 6:16pm

For me, divdown only works with BigFloats?

julia> function divdown(x::T, y::T) where T <: AbstractFloat
           @assert x > 0 && y > 0 # other cases left as an exercise for the reader
           setrounding(() -> x/y, T, RoundDown)
       end
divdown (generic function with 1 method)

julia> divdown(5., 2.)
ERROR: MethodError: no method matching setrounding(::Type{Float64}, ::RoundingMode{:Down})
Closest candidates are:
  setrounding(::Type{BigFloat}, ::RoundingMode) at mpfr.jl:800
Stacktrace:
 [1] setrounding(::getfield(Main, Symbol("##5#6")){Float64,Float64}, ::Type{Float64}, ::RoundingMode{:Down}) at ./rounding.jl:168
 [2] divdown(::Float64, ::Float64) at ./REPL[1]:3
 [3] top-level scope at none:0

julia> divdown(big(5.), big(2.))
2.5

julia> versioninfo()
Julia Version 1.0.2-pre.0
Commit 4aea3d2b7c (2018-09-30 15:02 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen Threadripper 1950X 16-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, znver1)

JeffreySarnoff · November 7, 2018, 6:16pm

pkg> add SetRounding
using SetRounding

Tamas_Papp · November 7, 2018, 7:53pm

It’s been a while since I used this, but shouldn’t this line have setrounding_raw?

JeffreySarnoff · November 7, 2018, 9:40pm

It wants to be doing this afaic


function setrounding(f::Function, ::Type{T}, rounding::RoundingMode) where T
    old_rounding_raw = rounding_raw(T)
    new_rounding_raw = to_fenv(rounding)
    setrounding_raw(T, new_rounding_raw)
    try
        return f()
    finally
        setrounding_raw(T, old_rounding_raw)
    end
end

@simonbyrne, @dpsanders does this need to a PR, (see Tamas’s link), if so I’ll do it – if not, we would like to understand how it works properly.

simonbyrne · November 8, 2018, 6:09am

My best guess would be

z = x/y
if fma(z,y,-x) > 0
    z = prevfloat(z)
end

simonbyrne · November 8, 2018, 6:11am

No, the use of rounding_raw/setrounding_raw was just to ensure type stability. That isn’t a problem in that particular case.

Tamas_Papp · November 8, 2018, 6:29am

so

julia> setrounding(Float64, RoundDown)
ERROR: MethodError: no method matching setrounding(::Type{Float64}, ::RoundingMode{:Down})

is now the expected behavior? I must have missed something.

dpsanders · November 8, 2018, 6:48am

setrounding for Float64 and Float32 was removed from Base, since it was unreliable, due to the way that LLVM does certain optimizations.

The relevant code was extracted to https://github.com/JuliaIntervals/SetRounding.jl.

Tamas_Papp · November 8, 2018, 7:05am

Thanks. My solution above should be considered incorrect then.

e3c6 · November 8, 2018, 1:40pm

This is the recommended approach then? Too bad it requires two ops.

bkamins · November 8, 2018, 2:19pm

You need to handle negative values separately as this covers x and y that are positive. The rule above fails e.g. for x=-2 and y=3.

simonbyrne · November 8, 2018, 4:46pm

The longer version is that we need a way to substitute LLVM floating point operations to use constrained floating point intrinsics. My thinking at the moment is to use a Cassette context to do this, though haven’t had a chance to try it out yet.

Topic		Replies	Views
BigInt precision issue General Usage	3	781	October 14, 2018
No overflow nor underflow rounding mode for Floats Numerics	6	898	August 19, 2018
A question regarding BigFloat General Usage question	11	1845	November 5, 2018
How come integer division 6.0 by 0.2 returning 29 is correct? General Usage question	19	2144	October 22, 2018
Comparing floating point numbers in an intuitive way fails within julia Internals & Design	6	1258	December 7, 2018

Largest float `z` such that `z*y <= x`

Related topics