How to specify the number of executions and the number of repetitions per execution in BenchmarkTools?

Hi everyone,

I’m trying to benchmark the speed of a function that I ported to Julia using BenchmarkTools.

In Python with IPython, I used %timeit -n 7 test_speed(X) to generate the benchmark speeds, with this result:

1min 6s ± 2.31 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

I want to replicate this benchmark trial to make a fair assessment so I tried this:

results = @benchmark test_speed(X) evals=7

with this result;

  memory estimate:  15.91 GiB
  allocs estimate:  84727240
  minimum time:     34.098 s (58.05% GC)
  median time:      34.098 s (58.05% GC)
  mean time:        34.098 s (58.05% GC)
  maximum time:     34.098 s (58.05% GC)
  samples:          1
  evals/sample:     7

I’m, however, not sure if that’s the correct way of doing it since the time is pretty much the same in terms of minimum, median etc.

Am I doing something wrong?

I’m often confused about the difference between evals and samples; see if you can make sense of it from here.

It sounds to me like you should specify samples instead.

Here’s the relevant explanation

The reasoning behind our definition of “sample” may not be obvious to all readers. If the time to execute a benchmark is smaller than the resolution of your timing method, then a single evaluation of the benchmark will generally not produce a valid sample. In that case, one must approximate a valid sample by recording the total time t it takes to record n evaluations, and estimating the sample’s time per evaluation as t/n . For example, if a sample takes 1 second for 1 million evaluations, the approximate time per evaluation for that sample is 1 microsecond. It’s not obvious what the right number of evaluations per sample should be for any given benchmark, so BenchmarkTools provides a mechanism (the tune! method) to automatically figure it out for you.
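To make the t/n estimate in the quoted passage concrete, here is a minimal sketch using only Base Julia. The helper name time_per_eval and the input x are made up for illustration; this is not how BenchmarkTools is implemented internally, just the same arithmetic done by hand.

```julia
# Time n back-to-back evaluations and report the average per evaluation.
function time_per_eval(f, n::Integer)
    f()                      # warm-up call so compilation is not included
    t0 = time_ns()
    for _ in 1:n
        f()
    end
    t = time_ns() - t0       # total elapsed nanoseconds for n evaluations
    return t / n             # estimated nanoseconds per evaluation
end

x = rand(1000)               # hypothetical input
per_eval_ns = time_per_eval(() -> sum(abs2, x), 100_000)
println("≈ ", per_eval_ns, " ns per evaluation")

# The arithmetic from the quoted example: 1 s over 1 million evaluations.
println(1.0 / 1_000_000, " s per evaluation")
```

For a function that already takes tens of seconds per call, a single evaluation is far above the timer resolution, so one evaluation per sample is enough.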

So it seems that in your case you should leave evals at the default, 1, and set samples to 7.

I will try that and report. It’s really confusing for me as well. Trying r = @benchmark test_speed(X) samples=7 changes nothing, as I got this:

  memory estimate:  15.99 GiB
  allocs estimate:  86012859
  minimum time:     34.815 s (58.44% GC)
  median time:      34.815 s (58.44% GC)
  mean time:        34.815 s (58.44% GC)
  maximum time:     34.815 s (58.44% GC)
  samples:          1
  evals/sample:     1

Now I’m super confused :sob:

Hmm, you might have hit the time limit before you reached the requested number of samples.

The seconds parameter defaults to BenchmarkTools.DEFAULT_PARAMETERS.seconds = 5.

You must set the time limit to more than 7 times the time you just measured, i.e. above roughly 7 × 35 s ≈ 245 s.
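Putting the three parameters together, a minimal sketch could look like the following. The test_speed and X defined here are hypothetical stand-ins, since the real code is not shown in this thread; samples, evals and seconds are the BenchmarkTools parameters discussed above.

```julia
using BenchmarkTools

# Hypothetical stand-ins; the real test_speed and X are not shown in this thread.
test_speed(X) = sum(abs2, X)
X = rand(1000)

# samples = 7   -> collect at most 7 timing samples
# evals   = 1   -> each sample is a single call
# seconds = 300 -> time budget; it has to exceed samples × time per call,
#                  otherwise sampling stops before 7 samples are collected
results = @benchmark test_speed($X) samples=7 evals=1 seconds=300

println(length(results.times))  # number of samples actually collected
```

The $X interpolation avoids timing the global-variable lookup; for a call that runs for tens of seconds it makes little practical difference.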

BTW, there’s probably some room for optimization before you benchmark super accurately since you allocate a huge amount of memory.

I’m increasing seconds to 300, and yes, there is a lot of room for optimization since I’m a Julia rookie. Even so, it’s already blowing the Python implementation away (it takes less than half the time it does in Python).

I will report what the new benchmark results show.

Sweet, I bet you can make that at least 10x :smiley:

New results after setting the seconds higher:

  memory estimate:  14.17 GiB
  allocs estimate:  74011729
  minimum time:     28.436 s (56.82% GC)
  median time:      32.221 s (58.45% GC)
  mean time:        32.082 s (58.02% GC)
  maximum time:     35.245 s (58.26% GC)
  samples:          7
  evals/sample:     1

Thanks for the tip. It would have been great if the seconds auto-adjusted to the number of samples specified :slight_smile:

The time is a benchmarking budget, not a desired time. That’s why benchmarking stops when the first limit is met.
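The "first limit wins" behaviour can be sketched with a small loop in Base Julia. This is a simplified illustration, not BenchmarkTools’ actual code; collect_samples and the sleep-based workload are made up for the example.

```julia
# Keep sampling until either the sample count or the time budget is reached.
function collect_samples(f; samples::Int, seconds::Real)
    times = Float64[]
    start = time()
    while length(times) < samples && (time() - start) < seconds
        push!(times, @elapsed f())   # one evaluation per sample
    end
    return times
end

slow() = sleep(0.05)   # hypothetical stand-in for a long-running call

# Budget shorter than a single call: only one sample is collected.
short_budget = collect_samples(slow; samples = 7, seconds = 0.01)

# Budget comfortably above 7 calls: all seven samples are collected.
long_budget = collect_samples(slow; samples = 7, seconds = 10)

println(length(short_budget), " vs ", length(long_budget), " samples")
```

This mirrors what happened earlier in the thread: a call of about 34 s against a 5 s budget yields a single sample regardless of the samples setting.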


FWIW, you don’t really need BenchmarkTools if your timing is on the order of seconds. You can just use @time.

Unlike @time, BenchmarkTools computes statistics (median, mean time, etc.) and has a nice representation of the result. And it’s easier to use one tool than to switch from one to another.

You can use BenchmarkTools if preferred, but for long-running calculations where you did a run to compile things already there are diminishing returns to it.

You will get a statistical sample, but it is usually a small one so the benefits are frequently marginal.
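For the long-running case, a rough equivalent of the %timeit summary can be assembled by hand with @elapsed and the Statistics standard library. This is only a sketch; test_speed and X below are hypothetical stand-ins for the code being benchmarked.

```julia
using Statistics

# Hypothetical stand-ins; replace with the real function and data.
test_speed(X) = sum(abs2, X)
X = rand(10^6)

test_speed(X)   # run once first so compilation time is excluded

# Seven runs, one evaluation each, like "7 runs, 1 loop each" in %timeit.
times = [@elapsed(test_speed(X)) for _ in 1:7]

println("mean ± std: ", mean(times), " s ± ", std(times), " s")
println("min / median / max: ", minimum(times), " / ", median(times), " / ", maximum(times), " s")
```

With only seven samples the spread is a rough indication at best, which is the point made above about small statistical samples.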

I upgraded to Julia 1.3.1 and the speed test is even better now:

  memory estimate:  14.40 GiB
  allocs estimate:  76011754
  minimum time:     16.140 s (11.75% GC)
  median time:      16.620 s (13.68% GC)
  mean time:        16.810 s (14.36% GC)
  maximum time:     17.806 s (16.92% GC)
  samples:          7
  evals/sample:     1

while the Python implementation in a heavily optimized library like sklearn is almost 5x slower:

1min 21s ± 2.85 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

I still can’t believe it~ :flushed: :astonished:

That’s great news :slight_smile:
I still think you can squeeze some extra performance out of Julia by working on the allocations, even though I have no idea what code you are actually benchmarking :stuck_out_tongue:

I’m cleaning up the code. I will make a post soon asking for optimization tips, since I have never really dealt with low-level languages and have no clue about allocations (but I am eager to learn now) :slight_smile:

Great :slight_smile: When you make such a post, make sure to include self-contained code such that anyone can just copy and paste it and get a timing result for themselves to improve upon. That’s a great way to ensure that you get a lot of people hooked on making the code faster :stuck_out_tongue:


Updated link BenchmarkTools.jl/ at master · JuliaCI/BenchmarkTools.jl · GitHub