What is the reason to add zero(f_value) in line https://github.com/JuliaLang/julia/blob/071c60a32773edc1e0d0d50c0aa1c138f4f7a36f/base/statistics.jl#L18
Does it not evaluate to f_value without any change of type?
I think that code simply got edited, and the original reason for having a zero(f_value) got lost.
I think that code could be done more simply as follows:
function mean(f::Callable, iterable)
    state = start(iterable)
    done(iterable, state) &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    count = 0
    total = zero(f_value)
    while true
        value, state = next(iterable, state)
        total += f(value)
        count += 1
        done(iterable, state) && return total/count
    end
end
Perhaps for type stability when Bool iterators are passed?
Probably it was for type stability, in case typeof(f_value + zero(f_value)) != typeof(f_value). If that happens, initializing total = f_value would cause the variable total to change type in the subsequent loop, leading to slow code.
In older versions of Julia, arithmetic with narrower integer types would automatically be widened to Int, leading to type instabilities if you weren’t careful in this way. Nowadays, the only built-in type for which + returns a different type is probably Bool: true + false === 1.
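To make the point above concrete, here is a small check one could run in a current Julia 1.x REPL. It is only an illustration of the promotion behaviour described, not code from Base:

```julia
# Bool is the odd one out: + on two Bools returns an Int.
b = true
@show typeof(b)               # Bool
@show typeof(b + zero(b))     # Int (Int64 on a 64-bit machine)

# Narrow integer types keep their type under + in current Julia.
x = Int8(3)
@show typeof(x + zero(x))     # Int8
```

So for Bool input, total = f_value would start as a Bool and turn into an Int on the first +=, whereas total = f_value + zero(f_value) is an Int from the start.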
f_value does not seem to be defined.
Does it even make sense to compute the average of booleans?
Ah, yes! My bad, so the code does need to get the first value from the iterator and evaluate f(value) to get the initial value for total (but it doesn’t need zero() then).
I can imagine cases where you might want to know the ratio of trues to the total, so you’d get a result between 0.0 and 1.0.
If f_value + zero(f_value) really is needed to get the correct type for the loop, though, that line should be commented, or its intent otherwise made clearer, so that people don’t stumble over it in the future.
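For what it's worth, here is a rough sketch of how the function might look with the first value pulled out of the iterator and the zero() line commented. It assumes the iterate protocol of Julia 1.x rather than the start/next/done functions used in the linked code, and mean_sketch is a made-up name chosen to avoid clashing with the real mean. It is not the actual Base implementation:

```julia
# Hypothetical sketch, not the Base/Statistics implementation.
function mean_sketch(f, iterable)
    it = iterate(iterable)
    it === nothing &&
        throw(ArgumentError("mean of empty collection undefined: $(repr(iterable))"))
    value, state = it
    f_value = f(value)
    # Adding zero(f_value) fixes the accumulator type up front
    # (e.g. Bool becomes Int), so `total` never changes type in the loop.
    total = f_value + zero(f_value)
    count = 1
    it = iterate(iterable, state)
    while it !== nothing
        value, state = it
        total += f(value)
        count += 1
        it = iterate(iterable, state)
    end
    return total / count
end
```

With this, mean_sketch(identity, [true, false, true, true]) gives 0.75, and an empty collection throws the ArgumentError. Note that for small integer types the accumulator keeps the narrow type, so this sketch does nothing to protect against overflow.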
If narrower integer types are no longer automatically widened to Int, this code might now be broken for small integer types, since total really needs to be able to hold N*typemax(typeof(f_value)) without overflowing (where N is the [unknown] number of values returned by the iterator).
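The overflow concern is easy to reproduce in current Julia. The snippet below is just an illustration; naive_total is a hypothetical helper that accumulates in the element type, the way a loop with total = zero(eltype(xs)) would:

```julia
# Hypothetical helper: accumulate in the element type, without widening.
function naive_total(xs)
    total = zero(eltype(xs))
    for x in xs
        total += x
    end
    return total
end

vals = fill(Int8(100), 3)

@show typemax(Int8) + Int8(1)   # wraps around to -128
@show naive_total(vals)         # 44, because 300 does not fit in an Int8
@show sum(vals)                 # 300, since Base.sum widens small integers to Int
```

So an accumulator that stays Int8 silently wraps, while summing in Int (as the built-in sum does for small integer types) gives the expected total.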
The average of booleans is the fraction that are true, so yes, it makes sense and is one of the most basic things you want to know about a collection of booleans.
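As a quick illustration using only Base functions (the flags vector is made-up example data):

```julia
flags = [true, false, true, true]

# sum on a Bool array counts the trues as an Int, so the division gives a Float64.
fraction_true = sum(flags) / length(flags)
@show fraction_true                      # 0.75

# count(flags) is an equivalent way to count the true entries.
@show count(flags) / length(flags)       # 0.75
```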