What is the meaning of `total = f_value + zero(f_value)`?

mzaffalon · December 26, 2016, 6:22pm

What is the reason for adding zero(f_value) on line 18 of base/statistics.jl? https://github.com/JuliaLang/julia/blob/071c60a32773edc1e0d0d50c0aa1c138f4f7a36f/base/statistics.jl#L18

Does it not evaluate to f_value without any change of type?

ScottPJones · December 26, 2016, 10:21pm

I think that code simply got edited, and the original reason for having a zero(f_value) got lost.
I think that code could be done more simply as follows:

function mean(f::Callable, iterable)
    state = start(iterable)
    done(iterable, state) &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    count = 0
    total = zero(f_value)
    while true
        value, state = next(iterable, state)
        total += f(value)
        count += 1
        done(iterable, state) && return total/count
    end
end

kristoffer.carlsson · December 26, 2016, 10:57pm

Perhaps for type stability when Bool iterators are passed?

stevengj · December 26, 2016, 10:58pm

Probably it was for type stability, in case typeof(f_value + zero(f_value)) != typeof(f_value). If that happens, initializing total = f_value would cause the variable total to change type in the subsequent loop, leading to slow code.

In older versions of Julia, arithmetic with narrower integer types would automatically be widened to Int, leading to type instabilities if you weren’t careful in this way. Nowadays, the only built-in type for which + returns a different type is probably Bool: true + false === 1.
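
To make the point above concrete, here is a minimal sketch of what adding zero(x) does to the type. The helper name `add_zero` is made up for illustration and is not part of Base; the results are what I would expect on current Julia versions.

```julia
# Hypothetical helper: same value as x, but with the type that `+` produces.
add_zero(x) = x + zero(x)

typeof(add_zero(true))      # Int (Int64 on 64-bit systems): Bool is promoted by +
typeof(add_zero(Int8(5)))   # Int8: no longer widened to Int
typeof(add_zero(1.5))       # Float64: unchanged

# Starting an accumulator from add_zero(first_value) rather than first_value
# means a later `total += x` does not change the type of `total`.
```

So for Bool input the accumulator starts as an Int, which is the type every later `total += f(value)` would produce anyway.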

mzaffalon · December 27, 2016, 4:33pm

f_value does not seem to be defined.

mzaffalon · December 27, 2016, 4:36pm

Does it even make sense to compute the average of booleans?

ScottPJones · December 27, 2016, 4:37pm

Ah, yes! My bad, so the code does need to get the first value from the iterator and evaluate f(value) to get the initial value for total (but it doesn’t need zero() then).
I can imagine cases where you might want to know the ratio of trues / total, so you’d get a result between 0.0 and 1.0.
If that f_value + zero(f_value) is really needed though to get the correct type for the loop, that line should be commented so that people don’t stumble upon it in the future, or have the intent made clearer.
If narrower integer types are no longer automatically widened to Int, this code might now be broken for small integer types, since total really needs to be able to hold N*typemax(typeof(f_value)) without overflowing (where N is the [unknown] number of values returned by the iterator).
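
A rough sketch of the corrected function, with the first value pulled from the iterator before the loop and the intent of the `zero` line spelled out in a comment. The name `mean_sketch` is hypothetical, and I am assuming the newer `iterate` protocol (Julia 1.x) in place of `start`/`next`/`done`, so this is not the code in Base.

```julia
# Sketch only: `mean_sketch` is an illustrative name, not the Base implementation.
function mean_sketch(f, iterable)
    y = iterate(iterable)
    y === nothing &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    value, state = y
    f_value = f(value)
    # Adding zero(f_value) gives `total` the type that `+` returns
    # (e.g. Int for Bool), so its type does not change inside the loop.
    total = f_value + zero(f_value)
    count = 1
    y = iterate(iterable, state)
    while y !== nothing
        value, state = y
        total += f(value)
        count += 1
        y = iterate(iterable, state)
    end
    return total / count
end

mean_sketch(identity, [true, false, true, true])  # 0.75
mean_sketch(identity, Int8[100, 100])             # -28.0, since the Int8 total wraps around
```

The second call shows the overflow concern: `total` stays an Int8, so 100 + 100 wraps to -56 before the division.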

StefanKarpinski · December 28, 2016, 12:04am

The average of booleans is the fraction that are true, so yes, it makes sense and is one of the most basic things you want to know about a collection of booleans.
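
As a small illustration (the vector `flags` is made-up example data), the fraction of true values can be computed with Base functions alone:

```julia
flags = [true, false, true, true]   # made-up example data

# Summing Bools counts the trues; dividing by the length gives the fraction.
fraction_true = sum(flags) / length(flags)   # 0.75
```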
