# Does the Julia intrinsic sum() apply fastmath by default?

I noticed that the output of `sum()` is different from a custom loop:

```julia
julia> function mysum(A)
           out = 0.0
           for ii in eachindex(A)
               out += A[ii]
           end
           return out
       end
mysum (generic function with 1 method)

julia> A = rand(1000);

julia> sum(A) - mysum(A)
-1.1368683772161603e-13
```

In Fortran (gfortran) this only happens when `-ffast-math` is enabled. Is this by design, or is there something else I missed? If this is expected, what other functions have this behavior?

Julia’s `sum` uses pairwise summation which is more accurate and usually faster than the regular loop.
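For readers unfamiliar with the technique, here is a minimal sketch of pairwise summation. This is illustrative only, not Base's actual implementation (which uses a larger block size and a SIMD-friendly inner loop); the threshold of 128 is an arbitrary choice for the example:

```julia
# Pairwise summation sketch: recursively split the range in half and sum each
# half, falling back to a plain sequential loop for small blocks. The balanced
# splitting keeps rounding error growth at O(log n) instead of O(n).
function pairwise_sum(A, lo=firstindex(A), hi=lastindex(A))
    n = hi - lo + 1
    if n <= 128                       # small block: plain sequential loop
        s = zero(eltype(A))
        for i in lo:hi
            s += A[i]
        end
        return s
    else                              # large block: split in half and recurse
        mid = lo + n ÷ 2
        return pairwise_sum(A, lo, mid - 1) + pairwise_sum(A, mid, hi)
    end
end
```

Because the two recursive calls are independent, the halves also form separate dependency chains, which is where the potential speed advantage over a strictly sequential loop comes from.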

More accurate, yes. But faster? My understanding is that it is ‘almost as fast’.

Theoretically it should be able to be faster for arrays that fit in cache, since the recursive splitting gives shorter dependency chains.

How short is that then? When testing, I find this loop to be consistently faster than `sum` for lengths from 10^1 to 10^6:

```julia
function mysum(xs)
    val = zero(eltype(xs))
    @simd for i in eachindex(xs)
        @inbounds val += xs[i]
    end
    return val
end
```

It’s a small difference, but consistent.

Compared to a non-simd loop, yes, that would be expected.

Note that with `@simd` you’re allowing reordering of the loop, which defeats the premise of Oscar’s statement.

Good to know. Thanks.
And why does sum() sometimes not agree with itself either?

```julia
julia> function rowsum(A)
           out = Array{Float64,2}(undef, size(A, 1), 1)
           for ii in axes(A, 1)
               out[ii, 1] = sum(A[ii, :])
           end
           return out
       end
rowsum (generic function with 1 method)

julia> A = rand(10, 1000);

julia> sum(A, dims=2) - rowsum(A)
10×1 Matrix{Float64}:
  2.2737367544323206e-13
  2.2737367544323206e-13
  4.547473508864641e-13
 -2.2737367544323206e-13
  7.389644451905042e-13
  1.7053025658242404e-13
 -2.8421709430404007e-13
 -2.2737367544323206e-13
 -5.115907697472721e-13
  5.684341886080802e-14
```

Because `sum(A, dims=2)` (the method you get when you pass `dims`) is not implemented like that; it's closer to a naive sum.

By naive sum do you mean like the loop I wrote at the start?

In the case where you have a large number of similar-magnitude integers (so their sum is `BigInt` territory), I think pairwise summation is asymptotically faster than sequential summation, since the balanced splits keep the partial sums similar in size.

That is interesting to me. I would have thought that base sum always used Kahan summation, at least on items with known size. I wonder why that is not the case.

Kahan summation is roughly 2x slower.
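For reference, a sketch of Kahan (compensated) summation; the extra operations per element to maintain the compensation term are where the roughly 2x cost comes from:

```julia
# Kahan summation: carry a running compensation `c` for the low-order bits
# lost when adding a small value to a large accumulator. Error stays O(eps)
# independent of length, at the cost of ~2x the arithmetic of a plain loop.
function kahan_sum(A)
    s = zero(eltype(A))
    c = zero(eltype(A))           # compensation for lost low-order bits
    for x in A
        y = x - c                 # fold the previous rounding error back in
        t = s + y
        c = (t - s) - y           # algebraically zero; captures what rounding discarded
        s = t
    end
    return s
end
```

Note the `(t - s) - y` expression must not be optimized away, which is exactly why fast-math flags (which assume associativity) break this algorithm.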

Ah sorry, I tend to always equate the term pairwise sum with Kahan sum. Anyway, what I meant is that I'm wondering why summing along other dimensions would be any different from a standard sum.

Not entirely sure. My guess would be that it's just easier to write it as a `mapreduce` than to write the code to do pairwise summation over arbitrary dimensions.

In general, the reduction operations in Julia (`mapreduce`, `sum`, …) are documented to treat the operator as associative, and to use an implementation-dependent associativity. If you want a particular associativity you have to use `foldl`/`foldr` or similar (or write your own loop).
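For example, `foldl` guarantees strict left-to-right association, so it reproduces a hand-written loop bit-for-bit, while `sum` is free to reassociate. A small sketch (`naive_sum` here is a stand-in for the loop from the start of the thread):

```julia
# Stand-in for the hand-written sequential loop: strict left-to-right order.
function naive_sum(A)
    s = zero(eltype(A))
    for x in A
        s += x
    end
    return s
end

A = rand(1000)
foldl(+, A) == naive_sum(A)   # true: foldl fixes left-to-right associativity
sum(A)                        # may reassociate (pairwise/SIMD); last bits can differ
```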

Good to know. I wasn’t aware this applies not only to `map` but also to the others.