BenchmarkTools with simple, fast-running function

robsmith11 · February 21, 2019, 2:55pm

I thought that BenchmarkTools was designed to be able to test simple, fast-running functions by evaluating them multiple times to avoid issues with the timing precision.

Using the example from the README, it doesn’t appear to me that it’s actually executing the function for each evaluation. Am I misunderstanding something?

julia> @benchmark sin(1) evals=1
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     18.000 ns (0.00% GC)
  median time:      20.000 ns (0.00% GC)
  mean time:        19.785 ns (0.00% GC)
  maximum time:     39.000 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark sin(1) evals=1000
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     0.017 ns (0.00% GC)
  median time:      0.020 ns (0.00% GC)
  mean time:        0.020 ns (0.00% GC)
  maximum time:     0.035 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark sin(1) evals=1000000
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     0.001 ns (0.00% GC)
  median time:      0.001 ns (0.00% GC)
  mean time:        0.001 ns (0.00% GC)
  maximum time:     0.001 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000000

bennedich · February 21, 2019, 3:15pm

Benchmarking is often tricky to get right for tiny functions, since Julia is so good at optimizing your code. Times of ~0.01 ns are the result of the compiler replacing your expression with a constant, and you end up benchmarking nothing at all. (1 nanosecond is ~3 CPU clock cycles on a 3 GHz computer, so 0.01 ns is not enough to do anything.)
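
To make the cycle arithmetic concrete, here is a small sketch. The helper name `cycles` is made up for illustration, and the 3 GHz clock is the same assumption as above (1 GHz is one cycle per nanosecond).

```julia
# Convert a measured time in nanoseconds to CPU clock cycles.
# `cycles` is a hypothetical helper, not part of BenchmarkTools.
cycles(t_ns, freq_ghz) = t_ns * freq_ghz

println(cycles(1.0, 3.0))    # one nanosecond at 3 GHz
println(cycles(18.0, 3.0))   # the evals=1 minimum from the question
println(cycles(0.017, 3.0))  # the evals=1000 minimum: far below one cycle
```

A result well under one cycle per evaluation means no real work can have happened per evaluation, which is the telltale sign of a folded constant.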

To get around that, I usually try to structure the benchmark expression in such a way that the compiler can’t constant-fold it or cheat in any other way. In the example below, I also chose a vector large enough that the CPU can’t learn the branching behavior.

julia> v = rand(100_000);

julia> a = similar(v);

julia> @btime $a .= sin.($v);
  626.487 μs (0 allocations: 0 bytes)

100k calls in 626 μs equals around 6.26 ns per call to sin (if called through broadcast).
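
For timing a single call rather than a broadcast, another option is the `Ref` interpolation idiom described in the BenchmarkTools manual: the argument is wrapped in a `Ref` and dereferenced inside the benchmarked expression, so its value is only known at run time. The sketch below assumes BenchmarkTools is installed; the variable names are illustrative and the timings will differ by machine and Julia version.

```julia
using BenchmarkTools

# Hide the argument behind a Ref so the compiler cannot treat it as a constant.
x = Ref(1.0)

# Literal argument: the compiler may fold sin(1) into a constant.
t_literal = @belapsed sin(1)

# Interpolated Ref, dereferenced inside the benchmarked expression.
t_ref = @belapsed sin($(x)[])

# @belapsed returns the minimum time in seconds; convert to nanoseconds.
println("literal argument: ", t_literal * 1e9, " ns")
println("Ref argument:     ", t_ref * 1e9, " ns")
```

If the second number is in the range of a few nanoseconds while the first is a small fraction of a nanosecond, the first measurement was most likely a folded constant.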

And yes, the example in the BenchmarkTools manual is now broken, and it would be great if they could talk more about these difficulties. Cf. this issue:

https://github.com/JuliaCI/BenchmarkTools.jl/issues/130

robsmith11 · February 21, 2019, 9:29pm

Thanks. I had suspected that but thought that because it was being used as an example, BenchmarkTools had some way to disable the optimization. For example, Rust has black_box that forces evaluation.

I’ve created an issue to hopefully avoid future confusion:
https://github.com/JuliaCI/BenchmarkTools.jl/issues/134

bennedich · February 21, 2019, 9:36pm

I think this PR was meant to address that (disabling optimizations), but unfortunately it was never completed.
