# @time and @benchmark give different results

[Corrected: ms is 10^{-3} seconds and μs is 10^{-6}.]

I’m evaluating two algorithms that do roughly the same thing, timing them with `@time` and `@benchmark`. The speed ratio between the two algorithms differs by roughly a factor of 10 depending on which timing method I use.

Also the times reported by `@time` are inconsistent with the range of times reported by `@benchmark`. Could anyone help me understand what’s going on?

``````# call this invocation of fit "Method 2" since function call has 2 arguments
julia> @time r2 = RBSilly.fit(RBSilly.yx, 0.0002);
0.043844 seconds (9.14 k allocations: 35.169 MiB, 39.28% gc time)

julia> @time r2 = RBSilly.fit(RBSilly.yx, 0.0002)
0.033817 seconds (9.14 k allocations: 35.169 MiB)

## and call the next "Method 3"
julia> @time r = RBSilly.fit(RBSilly.yx, 0.0002, 4);
0.011290 seconds (7.71 k allocations: 382.953 KiB)
``````

Based on this admittedly small sample, Method 2 takes around 34–44 ms (milliseconds)
and Method 3 ~11 ms, making Method 3 roughly 3–4× faster than Method 2. But

``````# Caution: reversed the order of the 2 methods relative to @time panel
julia> @benchmark RBSilly.fit(RBSilly.yx, 0.0002, 4);

julia> @benchmark RBSilly.fit(RBSilly.yx, 0.0002, 4)
BenchmarkTools.Trial: 6673 samples with 1 evaluation.
Range (min … max):  336.614 μs … 53.935 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     363.297 μs              ┊ GC (median):    0.00%
Time  (mean ± σ):   744.718 μs ±  1.802 ms  ┊ GC (mean ± σ):  3.36% ± 6.20%

█▃▁
███▆▆████▅▅▄▅▆▆▆▆▇▆▅▆▅▆▅▅▆▅▅▅▄▆▅▅▄▆▅▅▄▄▅▅▆▅▆▃▄▃▃▄▄▄▃▃▄▃▂▄▃▄▃ █
337 μs        Histogram: log(frequency) by time      6.73 ms <

Memory estimate: 348.70 KiB, allocs estimate: 7627.

julia> @benchmark RBSilly.fit(RBSilly.yx, 0.0002)
BenchmarkTools.Trial: 250 samples with 1 evaluation.
Range (min … max):  18.043 ms … 27.407 ms  ┊ GC (min … max):  0.00% … 15.60%
Time  (median):     20.347 ms              ┊ GC (median):    10.53%
Time  (mean ± σ):   20.046 ms ±  1.930 ms  ┊ GC (mean ± σ):   6.35% ±  5.67%
``````

`@benchmark` shows Method 2 at ~20 ms and Method 3 at ~0.36 ms (or perhaps 0.74 ms if we use the mean), so Method 3 is ~27–55× faster than Method 2. The `@time` value for Method 2 of 34–44 ms is considerably above the max of 27.4 ms reported by `@benchmark`, while the `@time` value for Method 3 of 11 ms is within `@benchmark`'s wide range of 0.3–53.9 ms, though far from the central estimate.

BTW, Method 3 is supposed to be a shortcut that gets most of the fit with less work than Method 2. But Method 3 does require additional setup, and so it’s unclear which will be faster in practice. Both timing methods seem to agree that Method 3 is faster, though differing in how much faster.

Environment: julia 1.6.2 on 64 bit Intel, Debian GNU/Linux (julia is the tarball direct from julialang.org)
Running under VS Code with functions defined in a file obtained through `Julia: Execute file in REPL`.
I believe everything had been compiled before I ran the code excerpted above.



Thanks @Elrod for catching a silly mistake. I have reworked my original report extensively with that in mind, and things are a lot less weird. But they are still oddly inconsistent.

I suppose the differences could all be random variation, but it’s hard to believe that’s all that’s happening.

Ross

In the case of `@benchmark`, shouldn’t you interpolate the first argument?

``````@benchmark RBSilly.fit($RBSilly.yx, 0.0002, 4)
``````

Several independent points:

1. Interpolating the argument doesn’t seem to affect the results much.
2. In the very first `@benchmark` result I reported, the max GC is 0.00%, yet the mean and SD are positive!
3. The results are consistent with the theory that `@time` has a fixed overhead of about 11 ms compared to `@benchmark`: adding 11 ms to the latter's results gives the former. So maybe that’s what’s going on.

I originally thought I was unlikely to see the problem in a trimmed down example, but 3. seems testable. Unfortunately my first attempt at a MRE was a little too minimal:

``````julia> @time 1
0.000000 seconds
``````

and a simple function call does not produce anything like 0.01 for duration:

``````julia> @time time();
0.000001 seconds
``````

Could be constant propagation, caching, or the GC being GC (sporadic and variable). The display of the max GC looks like a bug? (edit: likely Trial display assumes the max GC comes from the sample with the max time · Issue #248 · JuliaCI/BenchmarkTools.jl · GitHub)

Definitely not a constant overhead to `@time`.

I should add that `@benchmark` isn’t doing much more than `@time` itself. It’s just in a loop and in a function. You can get one step closer by doing something like:

``````test() = [@timed(RBSilly.fit(RBSilly.yx, 0.0002, 4)) for _ in 1:100]
``````
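To make that loop-in-a-function idea concrete (using `sum` over a random vector as a hypothetical stand-in for `RBSilly.fit`, since that module isn't shown here), the hand-rolled version plus a quick summary of its samples might look like:

``````julia
using Statistics

# Hypothetical stand-in workload; in practice substitute
# RBSilly.fit(RBSilly.yx, 0.0002, 4) here.
work() = sum(rand(10_000))

# Collect 100 timing samples, much as @benchmark does
# (minus its warmup and tuning phases).
samples = [@timed(work()) for _ in 1:100]

# @timed returns a NamedTuple; .time is elapsed seconds.
times = [s.time for s in samples]
println("min = ", minimum(times), " s,  median = ", median(times), " s")
``````

Note that the first sample will typically include compilation time, which is exactly the effect `@benchmark`'s warmup run is designed to exclude.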

Note that this also gives you a solid mental model for what that fancy `$` does:

``````test_with_dollars(f, a, b, c) = [@timed(f(a, b, c)) for _ in 1:100]
``````

Now doing `test_with_dollars(RBSilly.fit, RBSilly.yx, 0.0002, 4)` is comparable to `@benchmark $(RBSilly.fit)($RBSilly.yx, $0.0002, $4)`.
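As a rough, Base-only illustration of why that interpolation matters (hypothetical names; no BenchmarkTools required): reading a non-const global inside the timed expression forces dynamic dispatch, while passing the value as a function argument, as `test_with_dollars` does, lets the compiler specialize on its concrete type.

``````julia
# Non-const global: its type is not known to the compiler.
data = rand(10^6)

f_global() = sum(data)   # reads the untyped global on every call
f_arg(v)   = sum(v)      # v's concrete type is known when compiled

# Warm up both so compilation isn't included in the timings.
f_global(); f_arg(data)

t_global = @timed f_global()
t_arg    = @timed f_arg(data)
println("global: ", t_global.time, " s   arg: ", t_arg.time, " s")
``````

Declaring `const data` largely closes the gap, which is the same effect `$` interpolation achieves inside `@benchmark`.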