Benchmarking a value-dependent function

I have a package that, at its core, performs a calculation f(P, x) for various values of x, using various user-defined methods for problem P and inputs x.

The key is that even though all the components are type-stable, the value of x affects the runtime (much as a parametric root-finding problem has a harder time finding a root in some regions). At the same time, the user would benefit from a Julia-specific breakdown of the runtime (e.g. GC stats).

What the user should get out of this is a summary (e.g. quantiles) of runtimes, GC times, lock conflicts, etc.

Is @timed the best entry point for this, in a setting not unlike

do_timed(f, M, N = 1000) = [(x = randn(M); @timed(f(x))) for _ in 1:N]

?

Since the macro is inside the function, is it correct to assume that if f is type-stable, compile_time and recompile_time will be 0?

Or should I directly invoke time_ns() and Base.gc_num() like BenchmarkTools?

It really depends on what you actually want to show. @timed is likely fine for your purposes; I’d probably prefer it over time_ns and Base.gc_num.
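As a rough sketch of how the @timed route could produce the quantile summary asked about: the helper summarize_timed, the probabilities, and the test function below are made up for illustration, and the untimed warm-up call is an added assumption to keep first-call compilation out of the samples. The lock_conflicts field is only present on newer Julia versions, so it is read with a default.

```julia
using Statistics  # stdlib, provides quantile

# Variant of do_timed from the question, with one untimed warm-up call
# (an assumption, not part of the original) so compilation is not sampled.
function do_timed(f, M, N = 1000)
    f(randn(M))
    return [(x = randn(M); @timed(f(x))) for _ in 1:N]
end

# Hypothetical helper: quantiles of wall time and GC time, both in seconds.
function summarize_timed(samples; probs = [0.05, 0.5, 0.95])
    times = [s.time for s in samples]
    gctimes = [s.gctime for s in samples]
    # lock_conflicts is missing on older Julia versions; fall back to 0 there
    locks = [get(s, :lock_conflicts, 0) for s in samples]
    return (time = quantile(times, probs),
            gctime = quantile(gctimes, probs),
            lock_conflicts = sum(locks))
end

samples = do_timed(x -> sum(abs2, x), 100, 200)
stats = summarize_timed(samples)
```

Each element of samples is the NamedTuple returned by @timed, so further fields (bytes, or compile_time where available) can be summarized the same way.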

Technically, f itself being type stable might not be enough, since you can have localized instabilities that are “recovered” inside a type stable function. But yeah, that’s more or less right.
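A small illustration of what such a localized instability can look like. The names BOX, inner and outer are invented for this sketch. The return type of outer is inferred as Float64 thanks to the type assertion, yet the call to inner is dispatched at run time because the content of BOX is untyped, so a new content type can still trigger compilation on a later call.

```julia
# Hypothetical example: an untyped container hides the argument type
const BOX = Ref{Any}(2.0)

inner(a, x) = a * x

# The assertion "recovers" a concrete return type for callers of outer,
# but inner(BOX[], x) is still a dynamic dispatch inside.
outer(x::Float64) = inner(BOX[], x)::Float64

inferred = only(Base.return_types(outer, (Float64,)))  # Float64

first_result = outer(3.0)   # dispatches to inner(::Float64, ::Float64)
BOX[] = 2                   # same outer, but inner now sees an Int
second_result = outer(3.0)  # dispatches to inner(::Int, ::Float64)
```

So a nonzero compile_time for some samples would not necessarily contradict outer being type stable from the caller's point of view.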

One thing to note here is that @benchmark from BenchmarkTools.jl is often quite good at giving an overview of the stats you’re talking about (though not lock conflicts):

julia> @benchmark sum(rand(N, N)^2) setup=(N = rand(1:100))
BenchmarkTools.Trial: 306 samples with 986 evaluations per sample.
 Range (min … max):  72.999 ns … 92.887 μs  ┊ GC (min … max): 0.00% … 51.74%
 Time  (median):      8.139 μs              ┊ GC (median):    4.97%
 Time  (mean ± σ):   16.637 μs ± 18.097 μs  ┊ GC (mean ± σ):  7.11% ±  7.87%

  █▂
  ██▇██▅▄▅▃▃▄▄▂▃▃▃▅▃▃▄▁▁▃▃▃▃▂▃▃▂▃▂▁▁▂▂▃▃▄▃▂▄▃▃▃▃▄▃▄▃▁▂▁▁▁▂▁▁▁ ▃
  73 ns           Histogram: frequency by time        60.7 μs <

 Memory estimate: 224 bytes, allocs estimate: 4.
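If quantiles are wanted rather than the printed summary, the raw samples can be read off the returned trial. This is a sketch that assumes BenchmarkTools.jl is installed; the benchmarked expression and the sample count are arbitrary, and evals=1 is chosen here so that every sample is a single evaluation on a fresh input from setup.

```julia
using BenchmarkTools
using Statistics  # quantile

# setup runs once per sample; with evals=1 each sample times one call
trial = @benchmark sum(abs2, x) setup=(x = randn(100)) evals=1 samples=200

probs = [0.05, 0.5, 0.95]
time_q = quantile(trial.times, probs)      # nanoseconds
gctime_q = quantile(trial.gctimes, probs)  # nanoseconds
```

The times and gctimes fields hold one entry per sample, which is what makes this kind of post-processing possible.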