Hi, I cannot understand what happens in Julia when taking the mean of an eachrow
or eachcol
generator.
To get the mean of a matrix A
per column, one can use the function mean
along the first dimension with mean(A, dims = 1)
or map the mean
function to each column with mean.(eachcol(A))
.
Yet I accidentally forgot the dot and found out the following:
using Random, Statistics
A = randn(3,5)
mean(eachcol(A)) == mean.(eachrow(A))
# True
mean(eachrow(A)) == mean.(eachcol(A))
# True
This seems quite counter-intuitive to me and I could not figure out what exactly was happening. It seems as well that the underlying computation is distinct as for larger matrices mean(eachcol(A))
and mean.(eachrow(A))
differ by a round-off error:
using Random, Statistics
B = randn(3000,5000)
mean(eachcol(B)) == mean.(eachrow(B))
# False
mean(eachrow(B)) == mean.(eachcol(B))
# False
all(isapprox.(mean(eachcol(B)), mean.(eachrow(B))))
# True
all(isapprox.(mean(eachrow(B)), mean.(eachcol(B))))
# True
Is that due to the definition of mean for generators? I could not find an answer from the doc or the forum, so if the information is actually there and I missed it, could you please point me to how to address such questions? Thanks!
What’s happening is that mean
adds the given items, and divides by how many. If those items happen to be vectors, addition and scalar division still make sense, so that’s what it does. By contrast, mean.()
applies the function to each item.
Simpler examples:
julia> mean([1,2,3,4])
2.5
julia> mean([1:4, 5:8]) |> collect
4-element Array{Float64,1}:
3.0
4.0
5.0
6.0
julia> ((1:4) + (5:8)) / 2 |> collect
4-element Array{Float64,1}:
3.0
4.0
5.0
6.0
julia> mean.([1:4, 5:8])
2-element Array{Float64,1}:
2.5
6.5
julia> [mean(1:4), mean(5:8)]
2-element Array{Float64,1}:
2.5
6.5
2 Likes
I see, thanks! While I tend to find the mean.(eachrow(A))
option clearer, is mean(eachcol(A))
computationally more efficient? From your answer, I assume it relies on vector addition and division rather than computing several vector means. I see as well from a naive benchmark that mean(eachcol(B));
takes less time and memory to run than mean.(eachrow(B));
, and that may explain as well the slight difference in precision.
Edit: I’m asking out of curiosity, but for reference if someone is interested in pure performance, it seems mean(B, dims = 2)
is faster anyway.
Right, the reason they aren’t exactly ==
is that they perform the additions and divisions in different orders, and the inevitable small rounding errors will differ.
For speed it’s usually more efficient to operate on one big array, although there is talk of making some eachslice
form go there automatically, see e.g. https://github.com/JuliaLang/julia/issues/35292.
julia> u = rand(1000, 1000);
julia> @btime mean(eachcol($u));
742.352 μs (2002 allocations: 7.80 MiB)
julia> @btime mean.(eachrow($u));
1.035 ms (1004 allocations: 62.80 KiB)
julia> @btime mean($u, dims=2);
141.036 μs (7 allocations: 8.13 KiB)
2 Likes