I sometimes get different results when summing a vector of Float64 numbers on different Windows machines. The vectors are identical, so it seems `sum` can produce different results across hardware architectures (even though the OS is always Windows).
Is this expected?
Is there a way to guarantee the same results?
Would `sum_kbn` from KahanSummation.jl, or `xsum` from Xsum.jl, guarantee the same result?
Both of these alternative sums produce identical results across machines on my particular data.
Yes, the results are hardware-dependent. Depending on the width of the CPU’s vector registers, a different order of adding the numbers is optimal, and these different orders can slightly change the rounding.
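To see why register width matters, here is a made-up sketch (`simd_like_sum` is an illustration, not Julia's actual implementation): a SIMD sum keeps one partial sum per vector lane, so the number of lanes determines which elements get added together.

```julia
# Hypothetical sketch: simulate a SIMD-style sum with `lanes` parallel
# accumulators. The grouping of the additions depends on `lanes`.
function simd_like_sum(v::Vector{Float64}, lanes::Int)
    acc = zeros(Float64, lanes)   # one accumulator per vector lane
    for i in eachindex(v)
        acc[mod1(i, lanes)] += v[i]   # strided partial sums
    end
    foldl(+, acc)                 # horizontal reduction at the end
end

v = [1e16, 1.0, -1e16, 1.0]
simd_like_sum(v, 2)  # 2.0 — the 1.0s land together, the 1e16s cancel
simd_like_sum(v, 4)  # 1.0 — one 1.0 is rounded away against 1e16
```

Same data, same machine, different "vector width", different answer — which is exactly what happens when the same code runs on CPUs with different SIMD widths.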
You can use `foldl(+, x)` instead of `sum(x)` to guarantee a fixed left-to-right order.
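A tiny demonstration that the association order alone changes the result (the values here are chosen so the rounding is easy to see):

```julia
# The same three numbers, two different orders:
v = [1.0, 1e16, -1e16]

foldl(+, v)           # (1.0 + 1e16) - 1e16 == 0.0  (the 1.0 is rounded away)
foldl(+, reverse(v))  # (-1e16 + 1e16) + 1.0 == 1.0
```

Because `foldl` always associates left to right, it gives the same answer on every machine — just not necessarily the most accurate one.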
Most Kahan summation implementations aren’t SIMD-vectorized and thus will give exactly the same result across architectures (AccurateArithmetic.jl’s is vectorized, so it’s an exception), and Xsum.jl promises exactly rounded results, so you shouldn’t see any rounding differences there either.
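For reference, here is a sketch of the textbook scalar Kahan algorithm (KahanSummation.jl's `sum_kbn` uses a refinement of this idea, Kahan–Babuška–Neumaier, but the principle is the same):

```julia
# Sketch of scalar Kahan (compensated) summation. It is deterministic
# across machines because it runs strictly left to right, with no SIMD.
function kahan_sum(v)
    s = 0.0   # running sum
    c = 0.0   # compensation: low-order bits lost by the previous addition
    for x in v
        y = x - c         # re-inject the previously lost bits
        t = s + y
        c = (t - s) - y   # algebraically zero; in floats, the rounding error
        s = t
    end
    return s
end

v = [1e16, 1.0, 1.0, -1e16]
foldl(+, v)   # 0.0 — both 1.0s are rounded away against 1e16
kahan_sum(v)  # 2.0 — the compensation term recovers them
```

The compensation variable carries the bits that an ordinary running sum throws away, which is why the compensated result is both more accurate and (in scalar form) reproducible.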
Here is a fun example of the lack of associativity (thanks to Stefan Karpinski): `sumsto` builds vectors that are permutations of the same 2046 floats, yet fold left-to-right to any requested value.

```julia
function sumsto(x::Float64)
    0 <= x < exp2(970) || throw(ArgumentError("sum must be in [0,2^970)"))
    n, p₀ = Base.decompose(x) # integers such that `n*exp2(p₀) == x`
    [floatmax(); [exp2(p) for p in -1074:969 if iseven(n >> (p-p₀))]
    -floatmax(); [exp2(p) for p in -1074:969 if isodd(n >> (p-p₀))]]
end
```
```julia
julia> x = sumsto(2.3);

julia> y = sumsto(1e18);

julia> sort(x) == sort(y)
true

julia> foldl(+, x), foldl(+, y)
(2.3, 1.0e18)
```

The two vectors contain exactly the same floats; only the order differs, yet one ordering sums to 2.3 and the other to 1.0e18.