# Computation precision with Float

I see strange behavior for the computation of a very simple formula:

```julia
julia> rounding(Float64)
RoundingMode{:Nearest}()
julia> p=0.8
0.8
julia> 1-p
0.19999999999999996
```

I get the same result with Julia 1.8.0 on Windows 10 and with Julia 1.8.1 on Linux.
Is that the expected behavior?
Even stranger:

```julia
julia> q=0.2
0.2
julia> 1-q
0.8
```

Is there a well known way to manage these rounding errors or should I systematically make rounding adjustments in my code? Is the “Nearest” method of rounding the most appropriate?
Rational numbers seem to behave better than Float.

```julia
julia> p = Rational(8,10)
4//5
julia> 1-p
1//5
julia> q = Float64(1-p)
0.2
```

In the specific kind of problem I am working on, I will not use numbers with absolute value larger than 1.
Any recommendation?

`BigFloat` operations can be customized with `setprecision` and `setrounding`.
See "Arbitrary Precision Arithmetic" in the Julia manual.
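
A minimal sketch of how `setprecision` might be used, assuming its block form from Base; the variable names and the choice of 128 bits are only illustrative. It also suggests that more bits do not make 0.8 exact, and that `BigFloat(0.8)` inherits the error already present in the `Float64` literal.

```julia
# Run a computation at 128-bit BigFloat precision (the default is 256 bits).
result = setprecision(BigFloat, 128) do
    p = parse(BigFloat, "0.8")  # decimal string rounded at the current precision
    q = BigFloat(0.8)           # exact copy of the Float64 value, error included
    (diff = 1 - p, same = (p == q), bits = precision(p))
end

println(result.bits)   # 128
println(result.same)   # false: the two values differ beyond the 53rd bit
println(result.diff)   # close to 0.2, but still not exactly 1/5
```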

You might be looking for https://0.30000000000000004.com .

Inserting rounding at intermediate steps of the calculation is likely to reduce accuracy, unless you have reason to believe that all intermediate results will be nice numbers in decimal, in which case you may want something like DecFP.jl, Decimals.jl, or FixedPointNumbers.jl.
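
If the goal is only to test whether two results agree, one common alternative to rounding is to compare with a tolerance. A small sketch using `isapprox` (also written `≈`) from Base; the absolute tolerance `1e-12` is an arbitrary value chosen for illustration.

```julia
a = 1 - 0.8
b = 0.2

println(a == b)                       # false: the two Float64 values differ
println(a ≈ b)                        # true: isapprox with its default relative tolerance
println(isapprox(a, b; atol=1e-12))   # true: explicit absolute tolerance
```

The stored values are left untouched; only the comparison is relaxed.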

Is the speed of calculation for BigFloat the same as for Float64?
I am not so much looking for many decimals. But I’d like exact results when dealing with simple numbers.

As pointed out, Rational numbers seem better for my case, but I’m not sure their speed is comparable to Float64. Further, since I am applying trigonometric functions to these numbers, I am not sure whether that implies a hidden conversion to Float64.

Thanks for the pointers. I’ll look at that.

This is the expected behaviour of floating point representations. The problem is that the decimals `0.2` and `0.8` cannot be represented exactly in the binary format of `Float64` (IEEE 754 standard). The link provided by @mcabbott is a good starting point. For all the glorious details, click on "What Every Computer Scientist Should Know About Floating-Point Arithmetic" on that page.

```julia
julia> rationalize(0.2; tol=0)  # Convert to rational exactly (must be possible as Float64 precision is finite)
3602879701896397//18014398509481984

julia> rationalize(1 - 0.8; tol=0)
900719925474099//4503599627370496
```
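
To get a feel for how far apart the two results are, here is a small sketch using `eps`, `prevfloat`, and `big` from Base. It suggests that `1 - 0.8` and `0.2` are distinct `Float64` values separated by two representable steps.

```julia
a = 1 - 0.8
b = 0.2

println(eps(b))                 # spacing between adjacent Float64 values near 0.2
println(b - a == 2 * eps(b))    # true: the results are two representable steps apart
println(prevfloat(b, 2) == a)   # true: stepping down twice from 0.2 reaches 1 - 0.8
println(big(a))                 # the value stored in a, shown with many more digits
println(big(b))                 # the value stored in b, slightly above 1/5
```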

Rational numbers are exact because they store integers as numerator and denominator. However, analytic functions such as `sin` or `exp` generally do not map rational arguments to rational values, so their results can only be approximated on a computer anyway.

```julia
julia> x = 2//10
1//5

julia> (x.num, x.den)
(1, 5)

julia> rationalize(Float64(x); tol=0)  # Float64 cannot represent 0.2 = 2/10 exactly
3602879701896397//18014398509481984
```
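
On the question of a hidden conversion: a quick check, sketched with Base functions only, suggests that `sin` applied to a `Rational` returns a `Float64`, while basic arithmetic stays rational.

```julia
x = 1//5

y = sin(x)
println(typeof(y))            # Float64
println(y == sin(0.2))        # true: x is converted to floating point first
println(typeof(x^2 + 1//3))   # Rational{Int64} on a 64-bit system: still exact
println(x^2 + 1//3)           # 28//75
```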

In any case, computers are restricted to a countable subset of the real numbers, namely the computable numbers. I once saw a Common Lisp implementation of these in which each number was represented by a function that could be run to provide an approximation at the desired precision. The efficiency of such a representation is very low, and not many use cases require such fine control over the precision of approximations.

If they were, we would not need Float64 at all. I bet they are at least one to two orders of magnitude slower for most things.
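
A rough way to check this on your own machine, sketched with `@elapsed` from Base. The helper `sum_of_squares` and the array size are made up for illustration, and a package such as BenchmarkTools.jl would give more reliable timings.

```julia
# Hypothetical workload: sum of squares over a vector of any numeric type.
function sum_of_squares(xs)
    s = zero(eltype(xs))
    for x in xs
        s += x * x
    end
    return s
end

xs64 = rand(10_000)
xsbig = BigFloat.(xs64)   # same values, stored as BigFloat

# Call each method once first so compilation time is not measured.
sum_of_squares(xs64)
sum_of_squares(xsbig)

t64 = @elapsed sum_of_squares(xs64)
tbig = @elapsed sum_of_squares(xsbig)
println("Float64: ", t64, " s, BigFloat: ", tbig, " s")
```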

A brief overview of the issues:

The `DoubleFloats.jl` README (JuliaMath/DoubleFloats.jl on GitHub) contains a comparison of `Double64` vs. `Float128` vs. `BigFloat` at 128 bits of precision.

According to that comparison, `Double64` is faster than `BigFloat`.

Decimal types are an important tool in financial math because they represent decimal fractions, such as monetary values, exactly.

```julia
julia> using FixedPointDecimals

julia> p=FixedDecimal{Int,6}(0.8)
FixedDecimal{Int64,6}(0.800000)

julia> 1-p
FixedDecimal{Int64,6}(0.200000)

julia> print(1-p)
0.200000
```
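
The idea behind a fixed-point decimal can be sketched with plain integers: store the value as a count of 10^-6 units and do the arithmetic on integers. This is only an illustration of the principle, not how FixedPointDecimals.jl is implemented; `to_units` and `from_units` are hypothetical helpers.

```julia
digits_kept = 6
scale = 10^digits_kept   # one unit is 10^-6

to_units(x) = round(Int, x * scale)   # decimal value -> integer count of units
from_units(n) = n / scale             # integer count of units -> Float64 for display

p = to_units(0.8)         # 800000
one_minus_p = scale - p   # 200000, exact integer arithmetic

println(one_minus_p)                      # 200000
println(from_units(one_minus_p) == 0.2)   # true
```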

See also the packages already mentioned above: DecFP.jl, Decimals.jl, and FixedPointNumbers.jl.

You can find other “float” or “decimal” related packages on JuliaHub.