[ANN] PkgJogger.jl: Revise-compatible and boilerplate-free package benchmarking

PkgJogger.jl is a Revise-compatible, boilerplate-free tool, built on top of BenchmarkTools.jl, for running your package’s benchmarking suite. I’ve been using it for about a year and figured it was time to share it with the community.

Just Write Benchmarks

Throw your benchmarks into a file named benchmark/bench_*.jl, define a suite, and you’re off to the races with @jog PkgName. PkgJogger will wrap each file in a separate module (think SafeTestsets) and create a module (JogPkgName) for running, saving, loading, and judging your benchmarks.

# benchmark/bench_code.jl
using BenchmarkTools
using AwesomePkg
suite = BenchmarkGroup()
suite["fast"] = @benchmarkable fast_code()

# In the REPL (or a script)
using AwesomePkg
using PkgJogger

# Creates the JogAwesomePkg module
@jog AwesomePkg

# Warmup, tune, run and save all of AwesomePkg's benchmarks
julia> JogAwesomePkg.benchmark(; save=true)
[ Info: Saved results to .../benchmark/trial/a86f6b29-1bd6-4a9d-b902-c98daf293858.bson.gz
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "bench_code.jl" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "fast" => Trial(1.160 ms)

Results are saved to compressed BSON files in PKG_ROOT/benchmark/trial/UUID.bson.gz, and take roughly 1/7th the space of the *.json files created by BenchmarkTools.save. Plus, they contain various metadata: git status, system info, Julia version, and PkgJogger version.

Benchmark, Revise, and Benchmark Again!

Code not quite as fast as you’d like? PkgJogger works with Revise.jl, so whether you forgot to escape a pesky variable or want to tweak your code, PkgJogger will be ready to run the latest version on the next JogAwesomePkg.benchmark().

For example, I edited my package’s code, re-ran the benchmarks, and then used JogAwesomePkg.judge to compare the results. Like most things in PkgJogger, the heavy lifting is performed by BenchmarkTools.jl.

julia> JogAwesomePkg.benchmark(; save=true)
[ Info: Saved results to /Users/alexwadell/.julia/dev/PkgJogger/test/Example.jl/benchmark/trial/c9a0f1c2-d54b-4094-acf8-eb3912f86937.bson.gz
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "bench_code.jl" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "fast" => Trial(3.208 ms)

julia> JogAwesomePkg.judge("c9a0f1c2-d54b-4094-acf8-eb3912f86937", "a86f6b29-1bd6-4a9d-b902-c98daf293858")
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "bench_code.jl" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "fast" => TrialJudgement(+98.58% => regression)

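Since the judging is delegated to BenchmarkTools.jl, the same kind of comparison can be sketched without PkgJogger. The snippet below is a minimal illustration, not PkgJogger's implementation: old_version and new_version are made-up stand-ins for two versions of some package code, and comparing minimum estimates is just one reasonable choice.

```julia
using BenchmarkTools

# Hypothetical stand-ins for the old and new versions of some package code.
old_version() = sum(rand(100))
new_version() = sum(rand(10_000))

# Fixed parameters, so no tuning step is needed for this small example.
old_bench = @benchmarkable old_version() samples=100 evals=1
new_bench = @benchmarkable new_version() samples=100 evals=1
old_trial = run(old_bench)
new_trial = run(new_bench)

# Compare minimum times: the target (new) goes first, the baseline (old) second.
judgement = judge(minimum(new_trial), minimum(old_trial))

# `time(judgement)` is one of :regression, :improvement, or :invariant.
println(time(judgement))
```

The argument order mirrors the JogAwesomePkg.judge call above: the new run first, the run it is compared against second.
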
Long Tuning Times?

Like to benchmark but hate waiting for tuning? PkgJogger can reuse the tune from a prior run to partially or entirely skip the tuning step.

julia> JogAwesomePkg.benchmark(; ref="c9a0f1c2-d54b-4094-acf8-eb3912f86937")

Continuous Benchmarking Baked In!

Do you want to run your benchmarks as part of your CI? PkgJogger’s got you. PkgJogger.ci() will instantiate a benchmarking project (benchmark/Project.toml), run the suite, and save your results with a one-liner:

julia -e 'using Pkg; Pkg.add("PkgJogger"); using PkgJogger; PkgJogger.ci()'

Or, if you use GitHub Actions, just add uses: awadell1/PkgJogger.jl@latest to your workflow.

Smoke out Broken Benchmarks

Ever write benchmarks ages ago only to find they no longer work with the current code? @test_benchmarks AwesomePkg will run each benchmark once and make sure they don’t error out. While not a perfect test, @test_benchmarks can serve as a quick smoke test of the benchmarking suite.

julia> using AwesomePkg, PkgJogger

julia> @test_benchmarks AwesomePkg
Test Summary:  | Pass  Total
bench_code.jl  |    1      1
[...]

Plays Nice with Others

Sometimes it’s nice to use BenchmarkTools.jl directly and skip the PkgJogger workflow. PkgJogger doesn’t add any code to your benchmark/bench_*.jl files, so just include the file (or includet it to use Revise.jl) and use BenchmarkTools.jl to run the suite directly.
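
As a rough sketch of that direct workflow: the snippet below writes a throwaway benchmark file to a temporary directory (a stand-in for benchmark/bench_code.jl that benchmarks a made-up workload rather than AwesomePkg code), includes it, and runs the suite with plain BenchmarkTools.jl calls. The file contents and the one-second time budget are assumptions chosen to keep the example self-contained and quick.

```julia
using BenchmarkTools

# Stand-in for benchmark/bench_code.jl; a real file would also `using AwesomePkg`
# and benchmark the package's own functions.
bench_file = joinpath(mktempdir(), "bench_code.jl")
write(bench_file, """
using BenchmarkTools
suite = BenchmarkGroup()
suite["fast"] = @benchmarkable sum(rand(100))
""")

# Defines `suite` in the current module, exactly as the file wrote it.
include(bench_file)

# Tune evaluation counts, then run with a one-second budget per benchmark.
tune!(suite)
results = run(suite; seconds=1)

println(results)
```

With Revise.jl loaded, swapping include for includet would track later edits to the file.
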

Missing some features provided by PkgBenchmark.jl or BenchmarkCI.jl? PkgJogger can populate the benchmark/benchmarks.jl file for you:

using AwesomePkg
using PkgJogger
@jog AwesomePkg
const SUITE = JogAwesomePkg.suite()

Thanks for the post. I’m trying to understand why I should be using this instead of PkgBenchmark.jl. Can you please elaborate?

For me, the killer feature is supporting a Revise-based workflow. I’m a heavy user of the SciML ecosystem, so avoiding the TTFX of a new Julia instance (PkgBenchmark launches one for each run) is a huge win for me. But for the sake of completeness:

  • PkgBenchmark hits the TTFX issue on each run. PkgJogger doesn’t.
  • PkgBenchmark can set CLI flags. PkgJogger can’t.
  • PkgJogger doesn’t require a clean git state. PkgBenchmark does.
  • PkgJogger manages saving results automatically. PkgBenchmark doesn’t.
  • PkgBenchmark can generate Markdown reports. PkgJogger can’t.
  • PkgJogger recursively adds benchmarks from bench_*.jl files. PkgBenchmark requires everything to be defined in one file.
  • PkgJogger wraps bench_*.jl files in their own module like SafeTestsets. PkgBenchmark doesn’t.
  • PkgJogger uses BSON and doesn’t mangle keys. PkgBenchmark uses JSON and mangles keys, i.e., 1 becomes "1".
  • PkgJogger supports partially re-tuning a suite. PkgBenchmark doesn’t; it’s all or nothing.
  • PkgJogger has the @test_benchmarks macro. PkgBenchmark doesn’t have an equivalent.
  • PkgJogger has a one-liner / GitHub action for running benchmarks. PkgBenchmark relies on BenchmarkCI.
  • With BenchmarkCI, PkgBenchmark can upload results to GitHub. PkgJogger relies on other GitHub actions for this.

For example, PkgJogger can do this:


julia> using AwesomePkg, PkgJogger

julia> @jog AwesomePkg

# Git State can be clean or dirty.
julia> JogAwesomePkg.benchmark(;save=true);
[ Info: Saved results to [...]/HEAD_UUID.bson.gz

shell> git checkout main

# Revise handles the code changes, and we reuse the latest
# results for tuning.
julia> JogAwesomePkg.benchmark(;save=true, ref=:latest);
[ Info: Saved results to [...]/MAIN_UUID.bson.gz

# Replace `HEAD_UUID` and `MAIN_UUID` with
# the UUIDs reported above
julia> JogAwesomePkg.judge("HEAD_UUID", "MAIN_UUID")
[...]

shell> git checkout another-branch

# Revise handles the code changes, and we specify the UUID
# of the run to get the tune from
julia> JogAwesomePkg.benchmark(;save=true, ref="HEAD_UUID");
[ Info: Saved results to [...]/ANOTHER_UUID.bson.gz

# Replace `HEAD_UUID` and `ANOTHER_UUID` with the
# UUIDs reported above
julia> JogAwesomePkg.judge("HEAD_UUID", "ANOTHER_UUID")
[...]

Whereas in PkgBenchmark, it would be:


julia> using AwesomePkg, PkgBenchmark

# Git State **must** be clean.
julia> r1 = benchmarkpkg(AwesomePkg);

# Launches new julia process, re-compiles everything.
julia> r2 = benchmarkpkg(AwesomePkg, "main");

julia> PkgBenchmark.judge(r1, r2)
[...]

# Launches new julia process, re-compiles everything.
julia> r3 = benchmarkpkg(AwesomePkg, "another-branch");

julia> PkgBenchmark.judge(r1, r3)
[...]

To PkgBenchmark’s credit, the issue with mangling keys is due to how BenchmarkTools serializes its types to JSON. Aside from committing type piracy, using a different serializer, or changing BenchmarkTools, there’s not really a way around this issue.
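
A minimal sketch of the JSON round trip in question, using only BenchmarkTools.jl. The integer key and the workload are made up for illustration. Per the description above, the key is expected to come back as the string "1", though the exact behavior may depend on the BenchmarkTools.jl version, so the snippet just prints the keys before and after.

```julia
using BenchmarkTools

# A suite with a non-string (integer) key; the workload is a placeholder.
suite = BenchmarkGroup()
suite[1] = @benchmarkable sum(rand(10)) samples=10 evals=1
results = run(suite)

# Save to JSON and load it back.
path = joinpath(mktempdir(), "results.json")
BenchmarkTools.save(path, results)

# `load` returns a vector of the saved objects; take the first one.
loaded = BenchmarkTools.load(path)[1]

println("original keys: ", collect(keys(results)))
println("reloaded keys: ", collect(keys(loaded)))
```
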
