Julia equivalent of Python's "fsum" for floating point summation

#22

Radford’s C code has now been posted under an MIT license: gitlab.com/radfordneal/xsum


#23

Radford’s C code is designed for summing C doubles. We could create an interface to the C code, which would accurately sum many Float64 values. Or we could reimplement it in Julia and allow for summing any T<:AbstractFloat. The maximum number of floats summable by the small superaccumulator will vary with the precision of the values being summed.
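To make the idea concrete, here is a heavily simplified sketch of the small-superaccumulator concept in C: each finite double is split exactly into 32-bit chunks that are added into exponent-indexed 64-bit bins, so no information is lost while accumulating. This is an illustration of the principle only, not Radford's actual algorithm — it omits carry propagation and overflow handling, and the final reconstruction is naive rather than correctly rounded in all cases (it is exact when the true sum is exactly representable, e.g. where `1e16 + 1.0 - 1e16` summed naively returns 0.0 instead of 1.0):

```c
#include <stdint.h>
#include <math.h>

#define NBINS 68          /* covers the full double exponent range in 32-bit steps */
#define EMIN  (-1126)     /* smallest (frexp exponent - 53) for a subnormal double */

typedef struct { int64_t bin[NBINS]; } acc_t;

static void acc_add(acc_t *a, double x)
{
    if (x == 0.0) return;
    int e;
    double f = frexp(x, &e);            /* x = f * 2^e, 0.5 <= |f| < 1 */
    int64_t m = (int64_t)ldexp(f, 53);  /* exact signed 53-bit significand */
    int shift = (e - 53) - EMIN;        /* x = m * 2^(shift + EMIN), shift >= 0 */
    int lo = shift / 32, off = shift % 32;
    __int128 full = (__int128)m << off; /* at most ~85 bits, held exactly */
    /* split into three 32-bit slices; the identity
       full = (full>>64)*2^64 + ((full>>32)&M)*2^32 + (full&M)
       holds for negative values too under arithmetic shifts */
    a->bin[lo]     += (int64_t)(full & 0xFFFFFFFF);
    a->bin[lo + 1] += (int64_t)((full >> 32) & 0xFFFFFFFF);
    a->bin[lo + 2] += (int64_t)(full >> 64);
}

static double acc_round(const acc_t *a)
{
    /* naive high-to-low reconstruction; exact when the true sum is
       exactly representable, but not correctly rounded in general */
    double s = 0.0;
    for (int i = NBINS - 1; i >= 0; i--)
        s += ldexp((double)a->bin[i], 32 * i + EMIN);
    return s;
}
```

A Julia reimplementation along these lines could replace `double`/`int64_t` with generic types to cover `T<:AbstractFloat`, whereas a `ccall` wrapper around Radford's library would be limited to `Float64`.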


#24

Also, in a Julia implementation the size of the accumulator could be a type parameter, making it possible to use less memory when the magnitude of the sum is known to be within a given range.

Having read Radford’s paper, I’m worried that the accumulator will completely fill the L1 cache and therefore be slower in practice than the benchmarks show…
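As a rough sketch of how such a type parameter could pay off (this helper is hypothetical, not from the xsum paper): if the summands are known to lie in [2^emin_val, 2^emax_val), a radix-2^32 superaccumulator only needs enough 64-bit bins to span that exponent range plus the 53-bit significand width, since a shifted significand touches at most three adjacent bins.

```c
/* Hypothetical helper: bins needed by a radix-2^32 superaccumulator
   for summands known to lie in [2^emin_val, 2^emax_val). */
static int bins_needed(int emin_val, int emax_val)
{
    /* pad the exponent span by the 53-bit significand width, plus
       two extra bins because a shifted significand spans up to three */
    return (emax_val - emin_val + 53) / 32 + 2;
}
```

For the full double range this lands in the same ballpark as Radford's 67 chunks, while a narrow known range (say 2^-10 to 2^10) needs only a handful of bins — tens of bytes instead of ~0.5 kB, which also eases the cache-pressure concern.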


#25

His superaccumulator is 536 bytes for double precision (67 64-bit chunks). L1 cache sizes are typically ~32kB.


#26

The paper discusses two different superaccumulators, where the “fast” one is 4096 × 64 bits (32 kB).


#27

It also argues that most of these entries are not used for typical data with limited range. In any case, it claims at most a factor of two benefit from the “fast” method vs. the “small” method; I suspect that most cases requiring an exactly rounded sum are not performance-critical enough for this to matter. It would be nice to have an implementation of at least the “small” method in a Julia package for people who need exactly rounded floating-point sums.


#28

This is exactly what I was referring to. It’d be easy to write an implementation of the “fast” method with user-selectable range and memory footprint.