# Inconsistent behavior of `sum`,`mean` (and probably others) on different collection types

#1

I’ve discovered a lot of inconsistent behavior today with the `sum` and `mean` functions (while investigating a question asked here on Discourse), some of which are regressions compared to v0.4.x.
For example, on v0.4.x:

``````julia> sum([0xff,0xff])
0x00000000000001fe
julia> sum((0xff,0xff))
0xfe
julia> sum([0xff for i = 1:65536*256*8])
0x00000007f8000000

julia> mean([0xff,0xff])
255.0
julia> mean((0xff,0xff))
127.0
julia> mean([0xff for i = 1:65536*256*8])
255.0
``````

On v0.5.0:

``````julia> sum([0xff,0xff])
0x000001fe
julia> sum((0xff,0xff))
0xfe
julia> sum([0xff for i = 1:65536*256*8])
0xf8000000

julia> mean([0xff,0xff])
255.0
julia> mean((0xff,0xff))
127.0
julia> mean([0xff for i = 1:65536*256*8])
31.0
``````

So, collections that aren’t AbstractArray get incorrect results in both versions, but v0.5.0 gets incorrect results when summing or calculating the mean of larger collections, because it promotes only to a UInt32, instead of a UInt64.
This makes me worry that even more functions are broken in v0.5 and later, due to depending the previous promotion rules (and even then, some methods on abstract collections have always been broken).

#2

The possibility of overflow in Julia’s arithmetic is not a bug, it was an intentional design decision that has been extensively discussed elsewhere. The question is, by how much should the type be widened for a reduction?

This was changed to widen to 32 bits (instead of 64 bits on 64-bit machines) in https://github.com/JuliaLang/julia/pull/14414 for the reasons described in that PR. Again, a design decision, not a bug. (In fact, you responded “LGTM” to that PR. )

If you want to use `Int`, you can instead do `reduce(+, 0, array)`.

#3

Though `sum((0xff,0xff))` is clearly inconsistent with the equivalent method for arrays, and should probably also promote to `UInt32`. (I would also find it more logical for all cases to promote to `UInt16` even if it might be slower, as was discussed on the PR.)

#4

I think you missed what I feel is the most important part of what I saw, that an operation on a set of values should not give different results based on what type of collection holds the values (as long as iterating over the values produces them in the same order, as there are potential issues since the operations may not be associative, such as addition on fixed precision floating point).
Yes, I believe that `widen` is better promoting to `Int32` and `UInt32` rather than `Int` and `UInt` for `Int16/Int8`, but I think that is a separate issue. I noticed the change in `sum` because I was looking at the problems with inconsistent results from `mean`.
For certain types, `mean` is implemented with `sum`, but for others, it does the addition itself, using the promotion of `val + zero(typeof(val))`.
Is there anything that requires `sum` to use a particular promotion (and to not check for overflow)?
There is a special case for `sum` of `Bool`, for example, so why not special promotion rules for other types?
That is especially true for `mean`, where the result is not even of the same type (or widened type).
Also, I think there are other ways of implementing `mean` that can avoid getting totally incorrect results, while still being performant.
I’d thought I’d read in one of the long discussions that overflow was not checked for simple arithmetic operations, for performance reasons, but for more expensive operations, Julia tended towards trying to return the correct results or throw an exception. In those cases, silently getting the wrong answer fast, doesn’t seem all that useful to me.

#5

`sum` could certainly be implemented to use `Int64` promotion for smaller integer types, but then it would be slow on 32-bit machines (and be a bit slower on 64-bit machines). And checking for overflow would be (a) much slower and (b) type-unstable if you widen rather than throwing an exception.

(One possibility would be to use `Float64` accumulation, which is exact up to 53-bit integers and is fast on all architectures, but then of course there is the question of what integer type to convert the final `sum` result to and how to handle cases that overflow 53 bits, incurring roundoff errors. `Float64` accumulation makes even more sense for `mean`, since the result is returned in floating-point form anyway.)

Still, the current behavior is not a crazy tradeoff: if you want the space efficiency of working with e.g. 8-bit or 16-bit integers, the price is that you have to think a little bit more carefully about overflow (e.g. by calling `reduce` if you want to specify `Int64` or `Float64` accumulation).

Prior to the PR I mentioned, the behavior was to promote to `Int`, which is fast on both 32-bit and 64-bit machines, but then you get different results when you run on the different architectures. The behavior of `widen` is not an entirely “separate issue” from the behavior of `sum` — the considerations here are very similar to those in that PR.

#6

Hmm, I just noticed that `sum` widens `Int32` to `Int`, which means that the results are still architecture-dependent anyway. Given that, it is not crazy to widen `Int16` and `Int8` to `Int` too.

#7

(More simply, just `sum(Int, array)`.)

#8

You are still focusing on the issues of promotion. What about the issue of inconsistent results from different container types (even when the iterators produce the exact same values in the same order)?

#9

The question of `Tuple` handling is separate; I tend to agree with @nalimilan that `sum` should probably treat arrays and tuples similarly with respect to promotion.

A tip that you’ve heard before: raising multiple independent issues in a single thread confuses the discussion. Nor does calling design decisions you disagree with (or the absence of features you would like) “bugs” or “broken” facilitate productive discourse.

Was: Inconsistent behavior of `sum` and `mean`
#10

I am simply trying to raise an issue about some related cases that I found where some Julia functions give unexpected, inconsistent results. I have not said anything at all here about disagreeing with any design issues or features that are lacking from Julia.

Do you agree with the principle that if you have a function that operates on a set of values in a collection, that function should return the same result (if the values are iterated over in the same order)?
In this particular case, it is strictly an implementation detail that `mean` (in only some cases!) uses `sum` to calculate its result, which ends up giving inconsistent, incorrect results, depending on the type of collection the values are in.
I can already think of a number of different ways that `mean` can be implemented that would avoid that happening (and without checking for overflow for every addition either).

#11

You’ve raised at least four separate issues: whether `sum` should promote to a wider type than `Int32` for small integer types (or somehow check for overflow), whether `mean` should use a different algorithm (not `sum`/`length`) to make it less susceptible to overflow for integer collections, whether the algorithms of all functions should be independent of container type, and behavior changes from 0.4 to 0.5.

I agree that the algorithm of `mean` could be made less susceptible to overflow for integer arrays, most simply by using floating-point summation (although there are other possibilities).

I don’t necessarily agree that `f(collection::T)` should necessarily be independent of `T` for all `f` and `T`. For one thing, this precludes many improved algorithms that are applicable only to particular container types. (e.g. summation of floating-point values can be made vastly more accurate for arrays allowing random access than for iterators that only expose sequential access, unless one is willing to slow down the iterator case by a lot.) For another thing, this constraint is impossible to enforce in a multiple-dispatch language. That being said, in the specific case of `sum`, I agree that it would be reasonable to change `sum(::Tuple)` to use the same algorithm as for `Array`.

#12

it would not only be notably slower, but also violate the duck typing the method now has. for example now you can use complex valued functions, and have a complex result. you can of course solve this particular situation by using float(), but what about a type that does not support float(), but does support +() and zero(), and the result of +() does support /(::T, ::Integer) ?

edit: i kinda lost track which topic i’m in. this particular method of mean came up in another topic. but the principle is the same: if you force an intermediate float, you lose generality.

#13

Throwing a few ideas out in the form of rhetorical-ish questions:

Is wraparound ever desirable in practical situations? Wouldn’t you always want to throw an overflow error or return NaN? (Is an overflow even a NaN?)

Shouldn’t it be the user’s responsibility to cater for potential overflow errors, rather than the developer having to contort Julia’s internals? It’s reasonable to expect some level of engineering skill…

As noted, the user can specify `sum( T, [foo, bar] )` to manually force promotion to T. Isn’t this one of the Zen rules of Python: “Explicit trumps implicit”?

It must be difficult to implement a consistent automatic selective overflow checking internally: if checking is not present for `+`, how can it be present for `sum`, which is just a bunch of `+` operations in a loop?

Overflow checking is clearly at odds with performance. I wonder if it might make sense, rather than for Julia to make difficult trade-off decisions, to have some global `overflow_check` flag that would be on by default but can be turned off. By offing the flag the user is signing on the dotted line that they know what they’re doing. And maybe get the `@fastmath` macro to disable overflow checking with `tmp = overflow_check; overflow_check = false; ...; overflow_check = tmp`.

Speed is valuable, but a clean consistent simple interface is priceless. Machines get faster each year and 32 bit systems are arguably on the way out. So I’m not sold on the arguments for promotion to Int32 rather than Int64. If the user wishes to take advantage of specific architectures, let them manually supply the promotions. Let them optimise the code. But keep the internals of Julia clean as a whistle!

i.e. They can specify `(a+b)::t` – is that the same as `+(T, a, b)` or `+( promote(T, a, b) ... )`? Or does Julia first do `a+b` and subsequently convert to `T`?

I’ve just bumped into a similar promotion issue elsewhere (https://github.com/JuliaLang/julia/issues/19714)

I wonder if rather than just a `check_overflow` flag, maybe Julia could also have an overflow policy – on overflow:

• NaN
• warn
• error

… and a promotion policy:

• use biggest type (so 8 8 -> 8, 8 16 -> 16 etc.)
• next power of two (so 8 8 -> 16, 8 16 -> 32 etc.)
• 32-bit optimised: promote <32 to 32, <64 to 64 (so 8 8 -> 32, 8 16 -> 32, 16 32 -> 64 etc.)
• 64-bit: promote everything to 64, so (8 8 -> 64 etc.)

#14

@pitao Please don’t make this issue broader than it is (and apologizing about the signal-to-noise ratio as a preamble doesn’t really help). Deciding on the best behavior for `sum` and `mean` is already enough for on thread. Checking for overflow everywhere is a hard question, see for example this issue.

@pint `float` works for `Complex` arguments, it returns a `Complex{<:AbstractFloat}` value. Wouldn’t that be generic enough for a `mean` implementation?

@ScottPJones Do you think you could prepare a branch to fix the inconsistent behavior for tuples? Then I can open a PR for you.

#15

you mean the float() function. yes, i was acknowledging that, but it is already different than the proposal of using Float as accumulator. however, as i said, what about types that does not support float()? and don’t ask me to give an example. being general is a good thing, well, in general .

but you know what? here is an example: now i can compute the mean of rationals. if you float-ize the accumulator, mean of rationals will be float.

#16

Right. I guess the most general and most correct approach would be to accumulate using the type of `(one(T)+one(T))/one(T)`, i.e. the return type of `mean`. The only drawback is that it could be slower than preserving the input type (notably for small integers), but it’s also the best way of preventing silent overflow in most cases.

#17

I would only use this method for a set of types known to work: `T<:smallinteger` and `Complex{T}`. There’s no way to make `mean` immune to overflow for arbitrary user-defined types. I agree that there is a significant performance tradeoff here, and it is debatable whether it is worth it. (For integers < 32 bits, another option would be to do the sums in sufficiently small blocks.)

(There are lots of cases in Base where we have multiple methods for the “same” function, to take advantage of properties of individual types. e.g. `mean(a)` has 3 methods already.)

The basic decision in Julia was to make arithmetic on fixed-width integer types fast, at the price that the programmer has the responsibility to worry about overflow. We can and should mitigate this where it is practical to do so, and Julia will no doubt improve incrementally in this regard, but the basic tradeoff isn’t going to disappear.

#18

@nalimilan Yes (although, it’s not just tuples, it seems to be anything that is not an `AbstractArray`). Thanks, that would be great. I’m only trying to improve the overall quality of Julia here (and I’m not saying that the overall quality is bad at all (I wouldn’t be using it for production code ever if I felt that way!) it’s just that nothing is perfect (even if we all feel that Julia is closer to perfection than any other language we know!)
@stevengj I’d said that the results were inconsistent between v0.4.x and later versions for `sum`, not that that necessarily was a bug, however, where is it written that `sum` must use the `widen` promotion? (There is none, for `Bool`, and it promotes to `Int`). That seems to be again an undocumented implementation detail. I think an actionable item in this case would be to at least add to the documentation for `sum`, explaining what promotion is used, and how to work around the issue by using `sum(Int,...)` or `sum(UInt,...)` (I can make a branch for that as well, if some nice person can then make a PR out of it).

#19

Remember that the problem with `mean` is the inconsistency based on collection type.
The result type for `mean` on an `AbstractArray` of Integer types is `typeof(one(widen(T))/1)`, (not `one(T)`),
for Bool types it is `typeof(1/1)`, and for other iterables, `typeof((zero(T)+zero(T))/1)`.
I’d be rather concerned also about the performance consequences also of using a Float type here do to the summation.

split this topic #20

2 posts were split to a new topic: Was: Inconsistent behavior of `sum` and `mean`

Was: Inconsistent behavior of `sum` and `mean`