Identical functions repeated benchmarks show systematic differences

Nonetheless, I think your question raises a very good point. Most of the time we use BenchmarkTools not because we want to know how fast A is, but whether A is faster than B and by how much. In my opinion, a very good addition to BenchmarkTools would be a macro that compares A vs B vs … X directly, instead of us guessing whether one is faster than the others from their individual statistics. Such a macro could also apply internal bias reduction (reloading A and B, etc.), and running it for a long time should also account for potential bias from the machine/OS as a whole.
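A minimal sketch of what such a comparison helper could look like, built on the judge/ratio machinery that BenchmarkTools already provides (the name compare, its keyword, and the returned tuple are my own placeholders, not an existing API):

using BenchmarkTools

# Benchmark two zero-argument functions and summarize which one is faster.
# `estimator` picks the point estimate used for the comparison (minimum by default).
function compare(f, g; estimator = minimum)
    tf = @benchmark $f()                       # full trial for f
    tg = @benchmark $g()                       # full trial for g
    j  = judge(estimator(tf), estimator(tg))   # :improvement / :invariant / :regression
    return (judgement = j, ratio = ratio(estimator(tf), estimator(tg)))
end

With the A and B defined below, compare(A, B) would then report whether A is judged an improvement, a regression, or invariant relative to B.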

For now, following the statistics-on-statistics idea, we can run something like this for a long time:

A() = for i in 1:1000 sin(i) end
B() = for i in 1:999 sin(i) end

using BenchmarkTools

n = 10
d = Float64[]            # per-run differences between the times of A and B
for i in 1:n
    println(i)
    push!(d, (@belapsed A()) - (@belapsed B()))   # @belapsed returns the minimum time in seconds
end
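To guard against the machine/OS drifting over the run, a small variant (my own tweak, not part of the original post) alternates the order in which A and B are measured on each iteration, so a slow drift does not systematically penalize the same function:

d_alt = Float64[]
for i in 1:n
    if isodd(i)                    # measure A first on odd iterations …
        tA = @belapsed A()
        tB = @belapsed B()
    else                           # … and B first on even iterations
        tB = @belapsed B()
        tA = @belapsed A()
    end
    push!(d_alt, tA - tB)
end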

And then analyze the results, for example:

using HypothesisTests, Statistics, Plots
plot(d, label = "A-B")                                     # per-run differences
plot!([0], seriestype = :hline, label = "H0 μ=0")          # null hypothesis: no difference
plot!([mean(d)], seriestype = :hline, label = "μA - μB")   # observed mean difference

julia> pvalue(OneSampleTTest(mean(d),std(d),n))
0.004007649672446793

[plot: per-run differences A-B with the H0 (μ = 0) and mean-difference reference lines]
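With only n = 10 differences, and timing distributions that are often skewed, the t-test's normality assumption is a bit shaky; as a rough cross-check (my addition, not part of the original analysis) the same HypothesisTests package offers a nonparametric signed-rank test, and the t-test can also be fed the raw vector of differences directly:

pvalue(SignedRankTest(d))     # nonparametric test that A-B is symmetric around 0
pvalue(OneSampleTTest(d))     # same t-test as above, computed from the raw differences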

Also, since you’re working on this and you raised the problem, may I suggest opening an issue with the BenchmarkTools team? I’d support the idea of a macro like @benchmark A B …
