"sum([1.0, 1e100, 1.0, -1e100])" fails

I’ve seen a tweet from David Amos explaining this case where sum in Python fails miserably. The example is taken from “Binary floating point summation accurate to full precision” (Python recipe).

Julia fails too:

julia> sum([1.0, 1e100, 1.0, -1e100])
0.0

What would be the easiest way to avoid this error in Julia? In Python, fsum gives the correct answer.

1 Like
float(sum(Rational.(BigFloat[1.0, 1e100, 1.0, -1e100])))
2.0
4 Likes
julia> sum(sort([1.0, 1e100, 1.0, -1e100],by=abs,rev=true))
2.0

(“easiest” here may mean many things… these are just tricks to get across the idea of floating-point summation. If this were an actual problem, one would need to look at exactly what the calculation is for in order to decide what to do.)

4 Likes

Relevant?

5 Likes

A third option: if the problem is supposed to involve “nice” integer numbers, just use BigInts.

sum([1, BigInt(10)^100, 1, -BigInt(10)^100])
2 Likes

that’s kind of cheating… might as well just use more precision?

julia> setprecision(800) do # you can do this globally
           sum(BigFloat[1.0, 1e100, 1.0, -1e100])
       end
2.0
6 Likes

Try

using KahanSummation

sum_kbn([1.0, 1e100, 1.0, -1e100])

which was in Base earlier, IIUC. Related thread.

Edit: you’ll also find AccurateArithmetic.jl and the interesting task of extending error-free transformations.
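
For intuition, here is a minimal sketch of the compensated (Kahan–Babuška–Neumaier) idea behind sum_kbn; the real implementation in KahanSummation.jl may differ in details:

# Illustrative sketch only: carry a correction term for the bits lost by each +
function kbn_sum(xs)
    s = zero(float(eltype(xs)))   # running sum
    c = zero(s)                   # compensation for lost low-order bits
    for x in xs
        t = s + x
        if abs(s) >= abs(x)
            c += (s - t) + x      # low-order bits of x were dropped in s + x
        else
            c += (x - t) + s      # low-order bits of s were dropped in s + x
        end
        s = t
    end
    return s + c
end

kbn_sum([1.0, 1e100, 1.0, -1e100])  # 2.0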

9 Likes

The direct analogue of fsum in Julia is Xsum.jl, not Kahan/compensated summation (which is very accurate, but is not exactly rounded in general).
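
For reference, a quick sketch of its use (assuming the xsum function that Xsum.jl exports):

julia> using Xsum

julia> xsum([1.0, 1e100, 1.0, -1e100])
2.0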

9 Likes

This looks like it is working with a double precision compensated sum?

In general, this is not a language issue. Python is not “failing”, nor is Julia. You’ll get exactly the same thing if you just use + in any language using double-precision floating-point arithmetic (the default precision in most languages).

As another example, sum([1e-100, 1.0, 1e-100, -1.0]) also gives 0.0 — it is exactly the same problem, just rescaled, in which the exactly rounded answer is 2e-100. Is this a big error or a small error?

The key question is: compared to what?

  • It is a big error (100%) compared to the correct sum (i.e. the relative forward error). That’s because we are computing an ill-conditioned function (a sum with a catastrophic cancellation), so the relative error in the output is large even for tiny roundoff errors. You have to be careful when doing anything ill-conditioned in finite precision.

  • It is a small error compared to the magnitude of the inputs, i.e. the typical scale of the problem we are giving it (1e100 in your example, 1.0 in mine). A more precise way of saying this is that the “backwards relative error” is small. (See the short sketch after this list.)
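
To put numbers on both viewpoints, here is a small sketch; the “exact” value is obtained with exact rational arithmetic, as in the Rational trick earlier in the thread:

xs = [1.0, 1e100, 1.0, -1e100]
naive = sum(xs)                              # 0.0 from plain Float64 summation
exact = Float64(sum(Rational{BigInt}.(xs)))  # 2.0, computed without rounding

abs(naive - exact) / abs(exact)        # 1.0 -> 100% forward relative error
abs(naive - exact) / maximum(abs, xs)  # 2.0e-100 -> tiny compared to the input scale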

Summation is the easiest case to analyze, and the easiest for which you can write exactly rounded algorithms like fsum or xsum that do the computation in effectively infinite precision, as well as nearly perfect algorithms like compensated summation. But this simplicity is a two-edged sword: it also means that summation perhaps gets disproportionate attention, when in fact ill-conditioning and similar problems are something you have to be aware of for anything done in finite-precision arithmetic.

20 Likes

Agreed. But this could be about language guarantees like speed vs. accuracy (discussions we’ve had before about, for example, checked integer arithmetic or signalling NaNs).

People have mentioned this problem before.

This is NOT a problem with the programming language.

This is a LIMITATION of the Float64 binary floating-point representation of REAL NUMBERS.

There is only so much precision in Float64: if you add a small number to a big number, the result can only be represented with so many significant digits. So you start to lose precision once you exceed the limited precision of Float64.
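
For example, with Float64:

1.0 + 1e100 == 1e100   # true: the 1.0 is completely absorbed by rounding
eps(1e100)             # ≈ 1.9e84, the spacing between adjacent Float64 values near 1e100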

You will have exactly the same problem with BigFloat too if you exceed its precision, as shown below:

julia> sum([big"1.0", big"1e20", big"1.0", big"-1e20"])
2.0

julia> sum([big"1.0", big"1e40", big"1.0", big"-1e40"])
2.0

julia> sum([big"1.0", big"1e80", big"1.0", big"-1e80"])
0.0
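
(The default BigFloat precision is 256 bits, roughly 77 decimal digits, which is why the 1e80 case above is the one that breaks; raising the working precision, as in the setprecision example earlier, recovers the answer.)

julia> precision(BigFloat)
256

julia> setprecision(300) do
           sum(BigFloat[1.0, 1e80, 1.0, -1e80])
       end
2.0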

Even my own decimal floating point module has LIMITATIONS

julia> using DFPs # This is my own personal module
[ Info: Precompiling DFPs [top-level]

julia> DFP_setDefaultPrecision(10_000)
10000

julia> DFP_setShowStyle(:short) # don't show 10,000 digits in the results
:short

julia> sum([d"1.0", d"1e80", d"1.0", d"-1e80"])
2.00000

julia> sum([d"1.0", d"1e9999", d"1.0", d"-1e9999"])
2.00000

julia> sum([d"1.0", d"1e10000", d"1.0", d"-1e10000"])
0.00000
3 Likes

No, it could not.

There is no practical way for any language to guarantee exactly rounded results from all possible sequences of floating-point computations. (And there’s a questionable benefit to doing this for only a tiny number of functions like sum.)

This is very different from the issue of checking for integer overflow, which might indeed be practical in some cases (or maybe someday in all cases) because it is a purely local check on an individual elementary operation like +. In an ill-conditioned computation, in contrast, every individual step can be correctly rounded but the final answer can be completely wrong.
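
A rough sketch of that contrast (checked_add lives in Base.Checked):

using Base.Checked: checked_add

# Integer overflow is a local property of a single operation, so it can be
# caught right where it happens (at some runtime cost):
# checked_add(typemax(Int), 1) throws an OverflowError.

# In floating point, every individual + below is correctly rounded, yet the
# result of the whole sequence is still off by 2.0:
((1.0 + 1e100) + 1.0) - 1e100   # 0.0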

8 Likes

just use more precision:

you’re just exceeding the DEFAULT precision.