Julia equivalent of Python's "fsum" for floating point summation

moustachio-belvedere · November 20, 2018, 5:16pm

I have been looking for a Julia equivalent of Python’s fsum and have thus far been unable to find one. Is there an implementation already available? If not I will write my own and put it on Github for others to use.

From the Python docs:

math.fsum ( iterable )

Return an accurate floating point sum of values in the iterable. Avoids loss of precision by tracking multiple intermediate partial sums:

The algorithm’s accuracy depends on IEEE-754 arithmetic guarantees and the typical case where the rounding mode is half-even. On some non-Windows builds, the underlying C library uses extended precision addition and may occasionally double-round an intermediate sum causing it to be off in its least significant bit.

For further discussion and two alternative approaches, see the ASPN cookbook recipes for accurate floating point summation.

stillyslalom · November 20, 2018, 5:26pm

There was once a Kahan summation function in Base, but it got split off to a package prior to 1.0: GitHub - JuliaMath/KahanSummation.jl: Sum and cumulative sum using the Kahan-Babuska-Neumaier algorithm

Tero_Frondelius · November 20, 2018, 5:26pm

Maybe this is relevant: https://github.com/dpsanders/IntervalArithmetic.jl

stevengj · November 20, 2018, 6:15pm

My understanding is that Python’s fsum function doesn’t use Kahan summation, it uses an algorithm by Shewchuk that guarantees an exactly rounded result (i.e. the result as if you did the sum in infinite precision and then rounded it to the closest floating-point value).

Kahan summation, in contrast, doesn’t guarantee an exactly rounded result, but its error (relative to the L₁ norm of the input) is bounded independent of the number of terms. The default sum function in Julia is also quite accurate because it uses pairwise summation, with an error bound that grows only as the log of the number of terms.

(My understanding is that Shewchuk’s algorithm is especially useful if you want a small relative error even for ill-conditioned sums: sums of both positive and negative values where the result is nearly zero due to cancellations. I’ve never used the algorithm myself, though.)

StefanKarpinski · November 20, 2018, 8:19pm

See also Radford Neal’s paper on exact summation with superaccumulators:

If someone wants to implement exact summation, it might be worth comparing different approaches.

stevengj · November 20, 2018, 9:30pm

Someone may want to contact the author to ask for the C code under an MIT or BSD license, as otherwise translating the code from the paper may be problematic.

StefanKarpinski · November 20, 2018, 10:19pm

Or implement the algorithms without copying the code.

JeffreySarnoff · November 21, 2018, 5:56am

I wrote Radford, and asked about use of the C code under the MIT or BSD License.

Sukera · November 21, 2018, 6:32am

There’s also the possibility of one person transpiling the code from the paper to pseudo code and someone else implementing the algorithm based solely on the pseudo code (without reading the original implementation!) - that way, no explicit copying is going on, which is the part that would be problematic.

I’ve not read the paper yet, so I’d be happy to do the second part sometime next week if someone can provide a pseudo code version of this.

StefanKarpinski · November 21, 2018, 2:52pm

That’s the most paranoid version of GPL avoidance, but it’s not really legally necessary. As long as you don’t derive your code from GPL code, you can license your own code however you want. Of course the question is what counts as “deriving” your code from GPL code? Do you have to explicitly copy or translate it? Or can you be subconsciously influenced by GPL code you’ve seen but not consciously copied?

Sukera · November 21, 2018, 2:58pm

Not my idea, I’ve only described the technique as told here - as far as I understand, any derivation of GPL’d code has to be GPL’d itself. By describing the algorithm with various sources without directly translating it to make it as easy as possible to reimplement it, it’s possible to (somewhat) circumvent this.

IANAL though, so this might just be bonkers

moustachio-belvedere · November 21, 2018, 3:49pm

Thanks for the feedback everyone — looks like I have plenty of alternatives to choose from at the moment.

The default sum function in Julia is also quite accurate because it uses pairwise summation

I didn’t realize the regular sum function uses pairwise summation, this will probably be adequate for my present requirements. I need to hand in a PhD thesis in the 2 months time so it sounds like someone else will get to implementing Shewchuck’s (or Neal’s) algorithm before me. If not I will look at it again at that time.

EDIT : Out of curiosity, would this fsum equivalent belong in the stdlib or somewhere else?

StefanKarpinski · November 21, 2018, 4:36pm

Please do not do that. Translating GPL code to pseudocode that “looks suspiciously like Python” and then having someone else translate that pseudocode into actual Python (or in this case Julia) is far more legally dicey than just reading the paper and implementing the algorithm based on the description while disregarding the GPL code that’s included. Translating twice does not break the “derived work link”—you can translate through as many different programming languages as you want and it’s still a derived work.

StefanKarpinski · November 21, 2018, 6:04pm

What better time for procrastination?

stevengj · November 21, 2018, 10:29pm

(Note that the published code by Radford is not GPL—in fact, it has no license statement at all, which means that you cannot redistribute it, or code derived from it, in any form. Clean-room re-implementation is not about “GPL avoidance,” it’s about working around copyright law’s definition of “derived work.” In this case, the paper itself contains code excerpts, so it is especially tricky to re-implement the algorithm in a clean way from a copyright perspective.)

The author emailed me to say he was planning on posting a cleaned-up version of the code under a free/open-source license (his initial impulse was GPL, but I’ve urged him to consider a more permissive license for a small library routine like this). Not sure about the timeframe.

stevengj · November 22, 2018, 3:37pm

5 posts were split to a new topic: Copyright issues for code excerpts

y4lu · November 22, 2018, 9:03am

A Kahan / Pairwise polyalgorithm seems like it could still be pretty fast, it might be +15*3 operations for summing 256 numbers depending on where you switch. Not sure how much extra precision you’d get

StefanKarpinski · November 22, 2018, 4:58pm

Pairwise summation is the default in Julia; it is just as fast as naive summation and much more accurate. Kahan summation is even more accurate but significantly slower, which is why it is not the default.

y4lu · November 23, 2018, 7:58am

It’s a pretty neat trick too, and definitely a hard default to argue against. I was just noticing that each pass of the pairwise sum makes a partial Kahan sum on the remaining values 50% cheaper

Another approach

function finiteDif_sum(arr, d= 0.0)
  x = 0.0; y=0.0;
  y += d;
  for i in arr
    x = x + x-y + d + i;  ## +y-x -d ?
    y = y + i + y + i - x - d; ## and + x - y - i + d ?
    end;
  [x,y - d]
  end;

Add- might’ve gotten the x-y around the wrong way

JeffreySarnoff · November 26, 2018, 1:14am

I have posted a copy of Radford’s paper with all C code removed.
https://github.com/JeffreySarnoff/SuperaccumulatorPaperNoC/blob/master/superaccumulator.pdf

Topic		Replies	Views
"sum([1.0, 1e100, 1.0, -1e100])" fails General Usage question , numbers	13	1495	February 16, 2022
Accurate summation algorithm Numerics	34	10010	December 1, 2017
Kahan summation in `sum`? General Usage float , precision	13	1198	December 29, 2024
Speed issue with KahanSummation Performance question , package , parallel	12	1222	February 5, 2022
Array ordering and naive summation Numerics examples	5	5645	May 13, 2023

Julia equivalent of Python's "fsum" for floating point summation

Related topics