Julia equivalent of Python's "fsum" for floating point summation

I put together a Julia wrapper for his C code at:

I was too lazy to re-implement it from scratch, but I think this should be sufficient for most needs: it provides an xsum function that does exact double-precision (Float64 and ComplexF64) accumulation over arbitrary iterators (with an optimized method for contiguous arrays).

It seems to be quite fast — for an array of 10⁴ elements, it is actually about the same speed as a naive sum without SIMD. With @simd (which is enabled for the built-in sum) it is about 10x slower, since Neal’s code does not seem to exploit SIMD.

julia> function naivesum(a)
           s = zero(eltype(a))
           for x in a
               s += x
           end
           return s
       end
mysum (generic function with 1 method)

julia> a = randn(10^4);

julia> using Xsum, BenchmarkTools

julia> @btime sum($a);
  1.107 μs (0 allocations: 0 bytes)

julia> @btime xsum($a);
  12.272 μs (0 allocations: 0 bytes)

julia> @btime naivesum($a);
  12.175 μs (0 allocations: 0 bytes)

julia> err(s, exact) = abs(s - exact) / abs(exact) # relative error
err (generic function with 1 method)

julia> exact = Float64(sum(big.(a))); err(sum(a), exact), err(xsum(a), exact), err(naivesum(a), exact)
(1.5635496798094023e-16, 0.0, 6.097843751256669e-15)

On my machine, Neal’s “large” accumulator starts being faster than the “small” accumulator around n=256.

9 Likes