What is the meaning of `total = f_value + zero(f_value)`?

What is the reason to add zero(f_value) in line https://github.com/JuliaLang/julia/blob/071c60a32773edc1e0d0d50c0aa1c138f4f7a36f/base/statistics.jl#L18

Does it not evaluate to f_value without any change of type?

I think that code simply got edited, and the original reason for having a zero(f_value) got lost.
I think that code could be done more simply as follows:

```julia
function mean(f::Callable, iterable)
    state = start(iterable)
    done(iterable, state) &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    count = 0
    total = zero(f_value)
    while true
        value, state = next(iterable, state)
        total += f(value)
        count += 1
        done(iterable, state) && return total/count
    end
end
```

Perhaps for type stability when Bool iterators are passed?

Probably it was for type stability, in case typeof(f_value + zero(f_value)) != typeof(f_value). If that happens, initializing total = f_value would cause the variable total to change type in the subsequent loop, leading to slow code.

In older versions of Julia, arithmetic with narrower integer types would automatically be widened to Int, leading to type instabilities if you weren’t careful in this way. Nowadays, the only built-in type for which + returns a different type is probably Bool: true + false === 1.
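
To make the type argument concrete, here is a small sketch, assuming a current Julia 1.x release. The helper name `widened` is made up for illustration and does not appear in Base.

```julia
# Hypothetical helper: the same expression as in the line being discussed.
widened(x) = x + zero(x)

# Bool + Bool is promoted to Int, so the accumulator type differs from the input type.
bool_type = typeof(widened(true))      # Int (Int64 on 64-bit systems)

# Narrow integer types keep their type under + in current Julia versions.
int8_type = typeof(widened(Int8(3)))   # Int8

println(bool_type, " ", int8_type)
```

So for Bool input, starting from `f_value + zero(f_value)` gives `total` the type it will have after the first `+=`, which is the type-stability point made above.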

f_value does not seem to be defined.

Does it even make sense to compute the average of booleans?

Ah, yes! My bad, so the code does need to get the first value from the iterator and evaluate f(value) to get the initial value for total (but it doesn’t need zero() then).
I can imagine cases where you might want to know the ratio of trues to the total, so you'd get a result between 0.0 and 1.0.
If that f_value + zero(f_value) is really needed though to get the correct type for the loop, that line should be commented so that people don’t stumble upon it in the future, or have the intent made clearer.
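
As a rough sketch of what a commented version could look like: this is not the Base implementation, the name `mean_sketch` is hypothetical, and it assumes the `iterate` protocol of Julia 1.x rather than the older `start`/`next`/`done` functions used above.

```julia
# Hypothetical sketch, not the code in base/statistics.jl.
function mean_sketch(f, iterable)
    item = iterate(iterable)
    item === nothing &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    value, state = item
    f_value = f(value)
    # Adding zero(f_value) fixes the accumulator type before the loop
    # (e.g. Bool becomes Int), so `total` does not change type later.
    total = f_value + zero(f_value)
    count = 1
    item = iterate(iterable, state)
    while item !== nothing
        value, state = item
        total += f(value)
        count += 1
        item = iterate(iterable, state)
    end
    return total / count
end

println(mean_sketch(identity, [true, false, true, true]))
```

The comment on the `total = ...` line is the part that matters here; the rest only makes the example runnable.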
If narrower integer types are no longer automatically widened to Int, this code might now be broken for small integer types, since total really needs to be able to hold N*typemax(typeof(f_value)) without overflowing (where N is the [unknown] number of values returned by the iterator).
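
A small sketch of that overflow concern, assuming a current Julia 1.x release where `Int8 + Int8` stays `Int8`. The function `narrow_sum` is a hypothetical stand-in for the accumulation loop, not code from Base.

```julia
# Hypothetical stand-in for the accumulation loop with the same initialization.
function narrow_sum(xs)
    total = first(xs) + zero(first(xs))   # stays Int8 for Int8 input
    for x in Iterators.drop(xs, 1)
        total += x                        # Int8 arithmetic wraps around silently
    end
    return total
end

xs = Int8[100, 100, 100]
println(narrow_sum(xs))   # 44, because 300 does not fit in Int8
println(sum(Int, xs))     # 300, accumulating in Int instead
```

Converting each value to a wider type before accumulating, as in the second call, is one way to avoid the wraparound.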

The average of booleans is the fraction that are true, so yes, it makes sense and is one of the most basic things you want to know about a collection of booleans.
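
For example, a minimal illustration with made-up data (the variable names are only for this sketch):

```julia
flags = [true, false, true, true]

# count(flags) is the number of true entries in a Bool collection.
fraction_true = count(flags) / length(flags)

println(fraction_true)   # 0.75
```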
