Why is `mean` with the `dims` argument so slow?

e3c6 · June 10, 2020, 1:15am

Consider:

using BenchmarkTools, Statistics
A = randn(4,3,2);
@btime mean($A; dims=1);
#  90.184 ns (1 allocation: 128 bytes)
@btime mean($A);
#  9.891 ns (0 allocations: 0 bytes)

That’s roughly a 9x difference! Is there a way to improve on this, i.e. a faster way to compute the mean along a dimension?

Henrique_Becker · June 10, 2020, 1:28am

Consider:

julia> A = randn(400,300,200);
julia> using BenchmarkTools, Statistics
julia> @btime mean($A; dims=1);
  9.917 ms (2 allocations: 468.83 KiB)
julia> @btime mean($A);
  9.615 ms (0 allocations: 0 bytes)

Nanosecond measurements are bullshit.

ettersi · June 10, 2020, 3:11am

Perhaps it would be a bit more accurate to say that nanosecond measurements are very delicate, because you can very easily end up measuring something other than what you intended to measure. I believe this is what is happening in the benchmark reported by the OP: mean(A) returns a scalar and so it is allocation-free, while mean(A, dims=1) returns an array (here of size 1×3×2) and so it must perform at least one allocation. For an input this small, that allocation ends up dominating the overall runtime.
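A minimal sketch of how one might check this, using only Base and Statistics. The helper names `full_mean_bytes` and `dims_mean_bytes` are made up for illustration, and the exact byte counts will likely vary between Julia versions, so none are asserted here:

```julia
using Statistics

A = randn(4, 3, 2)

s = mean(A)            # reduces over every element
r = mean(A; dims=1)    # reduces along the first dimension only

# The full reduction returns a plain scalar ...
@show typeof(s)        # Float64
# ... while the dims reduction returns a new array with dimension 1 collapsed.
@show size(r)          # (1, 3, 2)

# Hypothetical helpers: wrapping the call in a function keeps
# global-variable access out of the @allocated measurement.
full_mean_bytes(x) = @allocated mean(x)
dims_mean_bytes(x) = @allocated mean(x; dims=1)

# Call once first so that compilation is not counted.
full_mean_bytes(A); dims_mean_bytes(A)

@show full_mean_bytes(A)
@show dims_mean_bytes(A)
```

If the second number is nonzero and the first is zero, that matches the allocation counts in the OP's `@btime` output.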

mbauman · June 10, 2020, 5:21am

Well, it’s an 80 ns difference. You can avoid some of it by pre-allocating the output in the shape you want:

julia> using BenchmarkTools, Statistics

julia> A = randn(4,3,2);

julia> @btime mean($A; dims=1);
  101.318 ns (1 allocation: 128 bytes)

julia> @btime mean($A);
  10.427 ns (0 allocations: 0 bytes)

julia> m = zeros(1, 3, 2);

julia> @btime mean!($m, $A);
  62.747 ns (0 allocations: 0 bytes)
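
As a rough sketch of the same pre-allocation idea for other dimensions: `mean!` averages over whichever dimensions of the output buffer have size 1, so the shape of the buffer selects the reduction. The buffer names `m1` and `m2` below are illustrative only:

```julia
using Statistics

A = randn(4, 3, 2)

# Output buffers: a size of 1 marks the dimension to average over.
m1 = zeros(1, 3, 2)   # mean along dimension 1
m2 = zeros(4, 1, 2)   # mean along dimension 2

mean!(m1, A)          # overwrites m1 in place
mean!(m2, A)          # overwrites m2 in place

# The in-place results agree with the allocating versions.
@assert m1 ≈ mean(A; dims=1)
@assert m2 ≈ mean(A; dims=2)
```

This mostly pays off when the same buffer is reused across many calls, for example inside a hot loop; for a one-off call the buffer still has to be allocated once.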

e3c6 · June 10, 2020, 12:15pm

Oh, I should have tested on larger matrices!
