Hi, I cannot understand what happens in Julia when taking the mean of an iterator over the rows or columns of a matrix. To get the per-column mean of a matrix A, one can take the mean along the first dimension with mean(A, dims = 1), or map the mean function over each column with mean.(eachcol(A)).
Yet I accidentally forgot the dot and found out the following:
using Random, Statistics
A = randn(3,5)
mean(eachcol(A)) == mean.(eachrow(A))  # true
mean(eachrow(A)) == mean.(eachcol(A))  # true
This seems quite counter-intuitive to me, and I could not figure out what exactly was happening. The underlying computation also seems to be distinct, since for larger matrices mean(eachcol(B)) and mean.(eachrow(B)) differ by a round-off error:
using Random, Statistics
B = randn(3000,5000)
mean(eachcol(B)) == mean.(eachrow(B))  # false
mean(eachrow(B)) == mean.(eachcol(B))  # false
Is that due to the definition of mean for generators? I could not find an answer in the docs or on the forum, so if the information is actually there and I missed it, could you please point me to where to look for such questions? Thanks!
What’s happening is that mean adds the given items and divides by how many there are. If those items happen to be vectors, addition and scalar division still make sense, so that’s what it does. By contrast, mean.() applies the function to each item separately.
julia> mean([1:4, 5:8]) |> collect     # [3.0, 4.0, 5.0, 6.0]
julia> ((1:4) + (5:8)) / 2 |> collect  # [3.0, 4.0, 5.0, 6.0]
julia> mean.([1:4, 5:8])               # [2.5, 6.5]
julia> [mean(1:4), mean(5:8)]          # [2.5, 6.5]
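Applied back to the original matrix question, this means mean(eachcol(A)) sums the five column vectors and divides by five, which is exactly the vector of row means. A small self-contained check (variable names here are just for illustration):

```julia
using Statistics

A = randn(3, 5)

# mean over the column iterator: vector addition of the 5 columns,
# then division by 5 — this yields the mean of each *row*.
colmean = mean(eachcol(A))

# broadcasting mean over the rows computes the same quantity row by row
rowmeans = mean.(eachrow(A))

@assert colmean ≈ rowmeans
@assert colmean ≈ vec(mean(A, dims = 2))
```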
I see, thanks! While I tend to find the mean.(eachrow(A)) option clearer, is mean(eachcol(A)) computationally more efficient? From your answer, I assume it relies on vector addition and division rather than computing several vector means. I also see from a naive benchmark that mean(eachcol(B)); takes less time and memory to run than mean.(eachrow(B));, and the different computation may also explain the slight difference in precision.
Edit: I’m asking out of curiosity, but for reference, if someone is interested in pure performance, mean(B, dims = 2) seems to be faster anyway.
Right, the reason they aren’t exactly == is that they perform the additions and divisions in different orders, and the inevitable small rounding errors differ.
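The order-dependence of floating-point addition is easy to see in isolation (a minimal sketch, unrelated to mean itself):

```julia
x = [0.1, 0.2, 0.3]

# summing left-to-right vs right-to-left rounds differently
left  = (x[1] + x[2]) + x[3]   # 0.1 + 0.2 already rounds to 0.30000000000000004
right = x[1] + (x[2] + x[3])

left == right  # false
left ≈ right   # true
```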
For speed it’s usually more efficient to operate on one big array, although there is talk of making some eachslice forms go there automatically, see e.g. https://github.com/JuliaLang/julia/issues/35292.
julia> u = rand(1000, 1000);
julia> @btime mean(eachcol($u));
742.352 μs (2002 allocations: 7.80 MiB)
julia> @btime mean.(eachrow($u));
1.035 ms (1004 allocations: 62.80 KiB)
julia> @btime mean($u, dims=2);
141.036 μs (7 allocations: 8.13 KiB)
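Despite the timing and rounding differences, all three spellings compute the same row means up to floating-point tolerance (a quick check; ≈ / isapprox uses a relative tolerance, so it is insensitive to the ordering effects above):

```julia
using Statistics

u = rand(1000, 1000)

a = mean(eachcol(u))        # vector addition of columns, then one division
b = mean.(eachrow(u))       # one mean per row
c = vec(mean(u, dims = 2))  # single pass over the whole array

@assert a ≈ b && b ≈ c
```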