Is it safe to compare rounded float values for equality?

I know that comparing two floats for equality using == is not a good idea, as it can yield unexpected results. For example, 0.1 + 0.2 == 0.3 returns false. I understand that there is an isapprox function to deal with this. My question is the following. Is it safe to compare two float values if they have been rounded to the same number of digits?

For example, round(0.1+0.2, digits=2) == round(0.3, digits=2) returns true. Can I expect this to always work as long as the two floating point numbers are rounded to the same two digits?

Thank you.

You can just do this

julia> epsilon = 1.0e-10
1.0e-10

julia> if abs( (0.1 + 0.2) - 0.3 ) <= epsilon
       println("A and B are equal")
       else
       println("A and B are NOT equal")
       end
A and B are equal

You can create your own function

julia> function IsEqual(A::Float64,B::Float64,epsilon::Float64=1.0e-10)
           if abs( A - B ) <= epsilon
               return true
           else
               return false
           end
       end

julia> IsEqual(0.1 + 0.2 , 0.3)
true

julia> println("Just make sure epsilon is bigger than ", abs( (0.1 + 0.2) - 0.3 ) )
Just make sure epsilon is bigger than 5.551115123125783e-17

Yes, I believe that should be safe. I’m curious why you’d want to do this though, and not use:

julia> 0.1 + 0.2 ≈ 0.3
true

Or, for more control over the tolerance used, use isapprox. For example:

julia> isapprox(0.1 + 0.2, 0.3; atol = 1e-2)
true

Ooofff. I suppose you mean:

function IsEqual(A::Float64,B::Float64,epsilon::Float64=1.0e-10)
    return abs( A - B ) <= epsilon
end

:wink:


Actually, it’s not safe…

julia> round(1e30 + 2e30; digits=2) == round(3e30; digits=2)
false

I’m guessing that what you mean by “safe” is “Would it give the same answer as it would in exact (infinite-precision) arithmetic?” The answer, of course, is “it depends”, but the most general answer is “no”.

That is, suppose you are comparing two numbers x and y that are computed by two different floating-point algorithms, and you want a comparison function is_same(x,y) that returns true if you would have x==y in infinite precision.

Suppose that your algorithms are accurate to 8 significant digits. Then you could do isapprox(x, y, rtol=1e-8). Or you could do round(x, sigdigits=8) == round(y, sigdigits=8), which is almost equivalent but much slower (about 100× slower on my computer!).

Of course, to do this, you need to have a rough sense of the accuracy of your algorithms. If it is a single scalar operation like 0.1 + 0.2, then it should be accurate to nearly machine precision, but for more complicated algorithms error analysis is much trickier. The default in isapprox (the ≈ operator) is to compare about half of the significant digits in the current precision, which is reasonable for many algorithms (losing more than half of the significant digits means you have a pretty inaccurate calculation), but is obviously not universally appropriate.

Naturally, be aware that such approximate comparisons may give false positives (returning true for two values that are supposed to be distinct in infinite precision, but differ by a very small amount).

Your suggestion, round(x, digits=8) == round(y, digits=8), is roughly equivalent to (but vastly slower than) isapprox(x, y, atol=1e-8) — an absolute tolerance rather than a relative tolerance. Usually, a relative tolerance is more appropriate in floating-point calculations, because relative tolerances are scale invariant.
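To make the scale-invariance point concrete, here is a small sketch. The values x and y and the power-of-two scale factors are illustrative choices only (powers of two are used so that the rescaling itself introduces no extra rounding).

```julia
# Two values that differ by one unit in the last place near 0.3.
x, y = 0.1 + 0.2, 0.3

for s in (2.0^-80, 1.0, 2.0^80)   # exact power-of-two rescalings
    a, b = s * x, s * y
    # rtol compares the difference to the magnitude of the inputs,
    # so the verdict is the same at every scale.
    rel = isapprox(a, b; rtol = 1e-8)
    # atol compares the difference to a fixed number, so the verdict
    # depends on the units the data happens to be in.
    abs_ = isapprox(a, b; atol = 1e-8)
    println("scale = ", s, ":  rtol says ", rel, ",  atol says ", abs_)
end
```

With rtol, all three scales compare as approximately equal. With atol, the largest scale fails, because the same last-bit difference has grown to about 6.7e7 in absolute terms.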

If you want a rigorous guarantee that two values might be the same, you can use Interval Arithmetic and implement might_be_same(x,y) = !isdisjoint(x,y). This might give you false positives, but will never give false negatives.
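As a rough, self-contained sketch of that idea (this is not the API of any interval package), the following uses a hypothetical Iv struct holding a lower and an upper bound, and widens the result outward by one floating-point step after each addition. The names Iv, enclose, and might_be_same are assumptions made for illustration; a real interval library controls rounding far more carefully and supports many more operations.

```julia
# Hypothetical minimal interval type: the true value lies in [lo, hi].
struct Iv
    lo::Float64
    hi::Float64
end

# Enclose a decimal literal whose Float64 value may be off by a rounding error.
enclose(x::Float64) = Iv(prevfloat(x), nextfloat(x))

# Interval addition, widened outward by one step to absorb the rounding of `+`.
Base.:+(a::Iv, b::Iv) = Iv(prevfloat(a.lo + b.lo), nextfloat(a.hi + b.hi))

# Two quantities might be equal exactly when their enclosures overlap.
might_be_same(a::Iv, b::Iv) = a.lo <= b.hi && b.lo <= a.hi

println(might_be_same(enclose(0.1) + enclose(0.2), enclose(0.3)))  # enclosures overlap
println(might_be_same(enclose(1.0), enclose(2.0)))                  # enclosures are disjoint
```

The first comparison reports that the values might be the same, and the second reports that they cannot be.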


The function you wrote is equivalent to calling isapprox with the atol keyword. As I mentioned above, however, typically a relative tolerance (rtol) is more appropriate.

No (unfortunately I speak from experience where I worked on a project that did this). The problem is that two nearby values can round in opposite directions, e.g.

julia> round(0.014999999999999, digits=2)
0.01

julia> round(0.015000000000000, digits=2)
0.02

Now that might seem unlikely, but if you test enough values, it will probably happen. For example, if the values are accurate up to 1e-8, and you check to 2 decimal places, the probability of it happening for any one test is 1 in a million:

julia> 1e-8/1e-2
1.0e-6

But if you’re testing say a 500*500 matrix then you would see it happen ~22% of the time:

julia> 1-(1-(1e-8/1e-2))^(500*500)
0.22119931428435058
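The same estimate can be wrapped in a small helper for other accuracies and problem sizes. This is only a sketch: p_any is a hypothetical name, and it assumes the comparisons are independent, which real data may not satisfy.

```julia
# Hypothetical helper: chance that at least one of `n` independent comparisons
# straddles a rounding boundary, given per-comparison probability `p`.
# Written with log1p/expm1 so that a tiny `p` does not lose precision.
p_any(p, n) = -expm1(n * log1p(-p))

p = 1e-8 / 1e-2                 # accuracy divided by rounding step
println(p_any(p, 500 * 500))    # the 500*500 matrix case, about 22%
println(p_any(p, 1))            # a single comparison
```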

Thanks very much for all the replies. I need to read some more carefully to digest all the information. The reason I was trying to avoid using isapprox is that I want to use these values as keys in a dictionary, and was wondering if I could use get(d, k, 0) to get the value for key k (where k is a float). I was thinking of using rounded floats as keys. Using isapprox would require me to explicitly loop through the keys to test whether the key exists. But it seems that is the safer way. Essentially I am trying to get by with Floats when I should be using decimals (via Decimals package).

Thanks again.

Any bucketing of floating-point numbers into more than one bucket will have the property that there are values which only differ in the last bit yet are in different buckets.
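One way to see this concretely is to bisect between two values that land in different buckets until no Float64 is left between them. The sketch below does that for rounding to two digits; find_boundary and bucket are hypothetical names, and the starting values 0.014 and 0.016 are arbitrary.

```julia
# Bisect between two floats that fall in different buckets until the
# endpoints are neighbours, i.e. there is no Float64 strictly between them.
function find_boundary(bucket, a::Float64, b::Float64)
    @assert a < b && bucket(a) != bucket(b)
    ba = bucket(a)
    while nextfloat(a) < b
        m = a + (b - a) / 2        # a float strictly between a and b
        if bucket(m) == ba
            a = m                  # m is still in a's bucket
        else
            b = m                  # m has left a's bucket
        end
    end
    return a, b
end

bucket(x) = round(x, digits = 2)
a, b = find_boundary(bucket, 0.014, 0.016)
println(b == nextfloat(a))          # the two values differ only in the last bit
println(bucket(a) != bucket(b))     # yet they land in different buckets
```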


For quick and dirty code, you can use

julia> _round_tozero(r::Float64, sigdigs) = reinterpret(Float64, reinterpret(UInt64,r) & (-1<< (63&(52-sigdigs))))
julia> _round_fromzero(r::Float64, sigdigs) = reinterpret(Float64, reinterpret(UInt64,r) | ~(-1<< (63&(52-sigdigs))))

I am sure there is a way of expressing this using Julia’s built-in rounding modes. Note that sigdigs here counts the significand bits that are kept, not decimal digits. The above is definitely close to the fastest possible code, but it has unexpected gotchas like isnan(_round_fromzero(Inf, 3)) or _round_tozero(NaN,0)==Inf (both are arguably wrong).

To be specific, I would say yes to this question. The problem occurs when the two numbers are not rounded to the same two digits.

Thanks for clarifying, that helps. As you say, one option is to use Decimals for this. Another would be to use a sorted dictionary, and find the closest mapping, and see if that’s within your accepted tolerance. If your dictionary has many elements, this should be faster than looping over them all, since it has logarithmic instead of linear complexity in the number of elements. For example, here’s your current problem using SortedDict:

julia> using DataStructures

julia> d = SortedDict{Float64, String}();

julia> d[0.1 + 0.2] = "foo";

julia> d[0.3]
ERROR: KeyError: key 0.3 not found

Now define:

function closestmapping(d::SortedDict{K,V}, k::K) where {K<:Real, V}
    t1 = searchsortedlast(d,k)
    t2 = advance((d,t1))
    m = map(t -> deref((d,t)), Iterators.filter(t -> status((d,t)) == 1, (t1,t2)))
    reduce((a,b) -> abs(k - a[1]) < abs(k - b[1]) ? a : b, m)
end

Then you could do:

julia> c = closestmapping(d, 0.3)
0.3 => "foo"

julia> c[1] - 0.3
5.551115123125783e-17

(Of course, if your algorithm is only accurate to say 8 digits, and you have multiple keys within 1e-8 of each other, this approach won’t work.)
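If pulling in a package is not desirable, the same nearest-key idea can be sketched with a plain sorted Vector and Base's searchsortedfirst instead of SortedDict. This is a swapped-in technique, not the code above: lookup_nearest, ks, vs, and the default tolerance are all hypothetical choices, and insertion into a sorted vector is linear rather than logarithmic.

```julia
# Hypothetical helper: `ks` must be sorted ascending, `vs[i]` is the value for `ks[i]`.
function lookup_nearest(ks::Vector{Float64}, vs::Vector, k::Float64;
                        rtol = 1e-8, default = nothing)
    i = searchsortedfirst(ks, k)          # first index with ks[i] >= k
    best = 0
    for j in (i - 1, i)                   # the only candidates for the nearest key
        if 1 <= j <= length(ks) &&
           (best == 0 || abs(ks[j] - k) < abs(ks[best] - k))
            best = j
        end
    end
    # Accept the nearest key only if it is within the relative tolerance.
    return (best != 0 && isapprox(ks[best], k; rtol = rtol)) ? vs[best] : default
end

ks = [0.1, 0.1 + 0.2, 0.7]
vs = ["bar", "foo", "baz"]
println(lookup_nearest(ks, vs, 0.3))   # finds the entry stored under 0.1 + 0.2
println(lookup_nearest(ks, vs, 0.5))   # no key within tolerance, so `default` is returned
```

The tolerance check makes the lookup return the default instead of silently matching a key that is merely the closest one available.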