How to specify the number of executions and the number of repetitions per execution in BenchmarkTools?

Hi everyone,

I’m trying to benchmark the speed of a function that I ported to Julia using BenchmarkTools.

In Python with IPython, I used %timeit -n 7 test_speed(X) to generate the benchmark speeds, with this result:

1min 6s ± 2.31 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

I want to replicate this benchmark trial to make a fair assessment so I tried this:

results = @benchmark test_speed(X) evals=7

with this result:

BenchmarkTools.Trial: 
  memory estimate:  15.91 GiB
  allocs estimate:  84727240
  --------------
  minimum time:     34.098 s (58.05% GC)
  median time:      34.098 s (58.05% GC)
  mean time:        34.098 s (58.05% GC)
  maximum time:     34.098 s (58.05% GC)
  --------------
  samples:          1
  evals/sample:     7

However, I’m not sure that’s the correct way of doing it, since the minimum, median, etc. are all exactly the same.

Am I doing something wrong?

I’m often confused about the difference between evals and samples myself; see if you can make sense of it from here:
https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md#benchmark-parameters
It sounds to me like you should specify samples instead.


Here’s the relevant explanation:

The reasoning behind our definition of “sample” may not be obvious to all readers. If the time to execute a benchmark is smaller than the resolution of your timing method, then a single evaluation of the benchmark will generally not produce a valid sample. In that case, one must approximate a valid sample by recording the total time t it takes to record n evaluations, and estimating the sample’s time per evaluation as t/n . For example, if a sample takes 1 second for 1 million evaluations, the approximate time per evaluation for that sample is 1 microsecond. It’s not obvious what the right number of evaluations per sample should be for any given benchmark, so BenchmarkTools provides a mechanism (the tune! method) to automatically figure it out for you.

So it seems that in your case you should leave evals at the default, 1, and set samples to 7.
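Roughly, a sketch of the distinction (untested; sin(1.0) is just a stand-in for a very fast function, test_speed and X are from your post):

using BenchmarkTools

# Fast function: one call is below the timer's resolution, so tune!
# picks many evals per sample and each sample reports time/evals.
b_fast = @benchmarkable sin(1.0)
tune!(b_fast)   # chooses evals automatically
run(b_fast)

# Slow function: a single call is already far above the timer's resolution,
# so evals can stay at 1 and each of the 7 samples is one full call.
b_slow = @benchmarkable test_speed($X) samples=7 evals=1
run(b_slow)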

I will try that and report back. It’s really confusing for me as well. Trying r = @benchmark test_speed(X) samples=7 changes nothing, as I got this:

BenchmarkTools.Trial: 
  memory estimate:  15.99 GiB
  allocs estimate:  86012859
  --------------
  minimum time:     34.815 s (58.44% GC)
  median time:      34.815 s (58.44% GC)
  mean time:        34.815 s (58.44% GC)
  maximum time:     34.815 s (58.44% GC)
  --------------
  samples:          1
  evals/sample:     1

Now I’m super confused :sob:

Hmm, you might have hit the time limit, since you didn’t hit the sample threshold.

It defaults to BenchmarkTools.DEFAULT_PARAMETERS.seconds = 5.

You must set the time limit to more than 7 times the time you just measured.
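For example, something along these lines (evals left at 1; 300 s is just a comfortably large budget for 7 samples of a ~35 s call):

results = @benchmark test_speed(X) samples=7 evals=1 seconds=300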

BTW, there’s probably some room for optimization before you benchmark super accurately, since you allocate a huge amount of memory.
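If you want a quick number to chase while optimizing, something like this (a rough first check, not a benchmark) gives the bytes allocated by a single call:

@allocated test_speed(X)   # total bytes allocated by one call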


I’m increasing the seconds to 300, and yes, there is a lot of room for optimization since I’m a Julia rookie. Even with that room, it’s already blowing the Python implementation away (it takes less than half the time it does in Python).

I will report what the new benchmark results show.

Sweet, I bet you can make that at least 10x faster :smiley:


New results after setting the seconds higher:

BenchmarkTools.Trial: 
  memory estimate:  14.17 GiB
  allocs estimate:  74011729
  --------------
  minimum time:     28.436 s (56.82% GC)
  median time:      32.221 s (58.45% GC)
  mean time:        32.082 s (58.02% GC)
  maximum time:     35.245 s (58.26% GC)
  --------------
  samples:          7
  evals/sample:     1

Thanks for the tip. It would have been great if the seconds auto-adjusted to the number of samples specified :slight_smile:


The time is a benchmarking budget, not a desired time. That’s why benchmarking stops when the first limit is met.
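If you’d rather not pass seconds to every call, one option is to raise the global default budget instead (the value here is just an example):

using BenchmarkTools
BenchmarkTools.DEFAULT_PARAMETERS.seconds = 300   # applies to subsequent @benchmark calls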


FWIW, you don’t really need BenchmarkTools if your timing is on the order of seconds. You can just use @time.
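For example, assuming you run the function once first so compilation isn’t included:

test_speed(X)          # warm-up: first call includes compilation
@time test_speed(X)    # reports runtime, allocations, and GC share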


Unlike @time, BenchmarkTools computes statistics (median, mean time, etc.) and has a nice representation of the result. And it’s easier to use one tool than to keep switching from one to another.


You can use BenchmarkTools if you prefer, but for long-running calculations where you have already done a run to compile things, there are diminishing returns.

You will get a statistical sample, but it is usually a small one, so the benefits are frequently marginal.

I upgraded to Julia 1.3.1 and the speed test is even better now:

BenchmarkTools.Trial: 
  memory estimate:  14.40 GiB
  allocs estimate:  76011754
  --------------
  minimum time:     16.140 s (11.75% GC)
  median time:      16.620 s (13.68% GC)
  mean time:        16.810 s (14.36% GC)
  maximum time:     17.806 s (16.92% GC)
  --------------
  samples:          7
  evals/sample:     1

while the Python implementation, in a heavily optimized library like sklearn, is almost 5x slower:

1min 21s ± 2.85 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

I still can’t believe it! :flushed: :astonished:

That’s great news :slight_smile:
I still think you can squeeze some extra performance out of Julia by working on the allocations, even though I have no idea what code you are actually benchmarking :stuck_out_tongue:

I’m cleaning up the code. I will make a post soon asking for optimization tips, since I have never really dealt with low-level languages and have no clue about allocations (but I am eager to learn now) :slight_smile:

Great :slight_smile: When you make such a post, make sure to include self-contained code such that anyone can just copy and paste it and get a timing result for themselves to improve upon. That’s a great way to ensure that you get a lot of people hooked on making the code faster :stuck_out_tongue:


Updated link: https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/docs/src/manual.md#benchmark-parameters