    julia> N = 100000000; aa = rand(Float32, N);

    julia> mean((x for x in aa))
    0.16777216f0

    julia> mean(aa)
    0.500059f0

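I know what’s happening: Float32 has limited precision, and mean over an iterator computes a “naive” left-to-right sum. That sum saturates once the running total reaches 2^24 = 16777216, because every further addition of a value below 1 is rounded away (hence 0.16777216 for N = 10^8). mean over an array uses pairwise summation and does not have this problem. However, it caught me by surprise in my real-world scenario: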

    julia> mean(skipmissing(Umat))
    1.0638367f0

    julia> mean(filter(!ismissing, Umat))
    3.1320891f0

That’s a bit scary. Is that just Something I Must Watch Out For? It feels like there should be a reasonable algorithm that gives good accuracy without sacrificing speed. Something like

    function sum(iter)
        total = zero(eltype(iter))
        minitotal = zero(eltype(iter))
        i = 0
        for x in iter
            if (i+=1) % 100000 == 0
                total += minitotal
                minitotal = zero(eltype(iter))
            end
            minitotal += x
        end
        return total + minitotal
    end

I’m not particularly good with numerical methods, but wouldn’t that be similar in efficiency, with much better accuracy?
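
To make the comparison concrete, here is a minimal sketch of the blocked idea next to a plain left-to-right loop. The names `naive_sum`, `blocked_sum` and `blocksize` are made up for this illustration and are not part of Base or Statistics. The accumulators are hard-coded to Float32 as a simplifying assumption, because `eltype` of a generator is `Any` and `zero(Any)` would throw.

```julia
# Plain left-to-right accumulation in Float32 (hypothetical helper).
function naive_sum(iter)
    total = 0.0f0
    for x in iter
        total += x
    end
    return total
end

# Blocked accumulation (hypothetical helper): sum `blocksize` values into a
# small partial total, then fold that partial total into the grand total.
function blocked_sum(iter; blocksize = 100_000)
    total = 0.0f0
    partial = 0.0f0
    count = 0
    for x in iter
        partial += x
        count += 1
        if count == blocksize
            total += partial
            partial = 0.0f0
            count = 0
        end
    end
    return total + partial
end

# Deterministic check: 20 million ones, true sum is 2.0e7.
ones_iter = (1.0f0 for _ in 1:20_000_000)
naive_sum(ones_iter)    # 1.6777216f7, stuck at 2^24
blocked_sum(ones_iter)  # 2.0f7
```

With random data the blocked version only reduces the rounding error rather than removing it, since each partial total still loses some low-order bits. Accumulating in Float64 or using compensated (Kahan) summation are other options with a similar single-pass structure.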