Julia equivalent of Python's "fsum" for floating point summation

stevengj · July 13, 2019, 2:50am

I put together a Julia wrapper for his C code at:

I was too lazy to re-implement it from scratch, but I think this should be sufficient for most needs: it provides an xsum function that does exact double-precision (Float64 and ComplexF64) accumulation over arbitrary iterators (with an optimized method for contiguous arrays).

It seems to be quite fast — for an array of 10⁴ elements, it is actually about the same speed as a naive sum without SIMD. With @simd (which is enabled for the built-in sum) it is about 10x slower, since Neal’s code does not seem to exploit SIMD.

julia> function naivesum(a)
           s = zero(eltype(a))
           for x in a
               s += x
           end
           return s
       end
mysum (generic function with 1 method)

julia> a = randn(10^4);

julia> using Xsum, BenchmarkTools

julia> @btime sum($a);
  1.107 μs (0 allocations: 0 bytes)

julia> @btime xsum($a);
  12.272 μs (0 allocations: 0 bytes)

julia> @btime naivesum($a);
  12.175 μs (0 allocations: 0 bytes)

julia> err(s, exact) = abs(s - exact) / abs(exact) # relative error
err (generic function with 1 method)

julia> exact = Float64(sum(big.(a))); err(sum(a), exact), err(xsum(a), exact), err(naivesum(a), exact)
(1.5635496798094023e-16, 0.0, 6.097843751256669e-15)

On my machine, Neal’s “large” accumulator starts being faster than the “small” accumulator around n=256.

Topic		Replies	Views
Speed issue with KahanSummation Performance question , package , parallel	12	1246	February 5, 2022
Accurate summation algorithm Numerics	34	10088	December 1, 2017
Optimizing sums of products (dot products) Performance simd , linearalgebra , sum	17	626	September 24, 2024
Sum of float64 vector gives slightly incorrect answer Performance question	44	5080	April 29, 2019
Julia speed Performance question	28	2274	March 14, 2024

Julia equivalent of Python's "fsum" for floating point summation

Related topics