Recommendation for CI benchmark to catch regression

It’s pretty common for Julia package developers to want to monitor key usage latency / performance to avoid regressions in PRs. Julia itself has Nanosoldier, and some big packages such as Makie have their own bot: Hardcode paths for default fonts to avoid search latency by jkrumbiegel · Pull Request #2531 · MakieOrg/Makie.jl · GitHub

Is there a semi-canned solution or workflow people would recommend? I have seen past efforts such as GitHub - maxbennedich/julia-regression-analysis: Regression Analysis for Julia, but it didn’t seem to be widely used. Maybe it’s just not worth the effort given the stability and cost; I think even very popular packages such as DataFrames.jl don’t have automated performance/latency checks in CI.


Maybe benchmark.yml and benchmark-comment.yml would be useful here? E.g. in JET: providing comments like OptAnalyzer: ignore reports from const-prop when concrete-evaled already by aviatesk · Pull Request #561 · aviatesk/JET.jl · GitHub

These track performance regressions but not latency regressions.

For latency regressions a good proxy might be invalidation tracking:
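As a rough sketch of what invalidation tracking looks like in practice, here is how you might measure invalidations caused by loading a package with SnoopCompileCore/SnoopCompile (this is an illustrative sketch, not from the thread; `MyPkg` is a placeholder, and on older SnoopCompile versions the macro was called `@snoopr` rather than `@snoop_invalidations`):

```julia
# Record method invalidations triggered by loading a package.
# SnoopCompileCore does the lightweight recording; SnoopCompile analyzes it.
using SnoopCompileCore
invalidations = @snoop_invalidations using MyPkg  # MyPkg is a placeholder

using SnoopCompile
trees = invalidation_trees(invalidations)  # group invalidations by their trigger
println("unique invalidated methods: ", length(uinvalidated(invalidations)))
```

A CI job could run this on master and on the PR branch and fail (or comment) if the count grows significantly.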

Maybe someone can volunteer to add these to the various template package generators…


This is actually super complicated; it needs two CI YAML files to do the dance :frowning:

Meanwhile, Makie.jl does the whole GitHub-API-related thing from inside Julia:


Btw, there is also github-action-benchmark with Julia support, and [ANN] AirspeedVelocity.jl - easily benchmark Julia packages over their lifetime

Looks unmaintained / experimental but still useful. I think that’s the package used in the workflows linked by Krastanov. But AirspeedVelocity looks more current.


For the sake of completeness, the comment from Krastanov above actually mentioned BenchmarkCI.jl, i.e. JET uses BenchmarkCI.jl.
I cannot disagree that it looks unmaintained and experimental, but it works. Moreover, I have always found @tkf 's packages very well designed, and they speak to my personal preference.


This is what AirspeedVelocity.jl looks like on a typical PR: Create `AutoFloat` type for units by MilesCranmer · Pull Request #66 · SymbolicML/DynamicQuantities.jl · GitHub

It’s extremely useful for catching regressions:

It’s saved me multiple times from introducing a performance regression due to some type instability I didn’t notice. It’s also very useful for monitoring time-to-load. Basically just copy this file into a workflow, and make sure you have a file `benchmark/benchmarks.jl` which uses BenchmarkTools to define `const SUITE = BenchmarkGroup()`.
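For anyone unfamiliar with the `benchmark/benchmarks.jl` convention, a minimal sketch looks like this (the package name `MyPkg` and the benchmark names are placeholders; the `SUITE` layout follows the standard BenchmarkTools/PkgBenchmark convention):

```julia
# benchmark/benchmarks.jl -- minimal sketch of a benchmark suite.
# The tooling (AirspeedVelocity, PkgBenchmark, BenchmarkCI) looks for a
# top-level constant named SUITE.
using BenchmarkTools
using MyPkg  # placeholder: the package being benchmarked

const SUITE = BenchmarkGroup()

# Benchmarks can be nested into groups for organization.
SUITE["basics"] = BenchmarkGroup()
SUITE["basics"]["sum"] = @benchmarkable sum(x) setup = (x = rand(1_000))
SUITE["basics"]["sort"] = @benchmarkable sort(x) setup = (x = rand(1_000))
```

The tool then checks out each revision, runs the suite, and compares the timings across revisions.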

So far it’s used in my repos, and also SymbolicUtils.jl. (Maybe others too that I’m not aware of.)

Compared with BenchmarkCI.jl/PkgBenchmark.jl it’s not as extensive, so it’s probably good to check out both. AirspeedVelocity.jl is basically a rebuilt version (it also uses BenchmarkTools) with a significant emphasis on the command line, because that makes it easier to interface with git, especially if you just want to quickly check for regressions against master or something.
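For a sense of that command-line workflow, a quick local comparison might look like this (a sketch from memory; `MyPkg` and the revision names are placeholders, and the exact flags should be checked against `benchpkg --help`):

```shell
# Install AirspeedVelocity, which provides the benchpkg CLI tools.
julia -e 'using Pkg; Pkg.add("AirspeedVelocity")'

# Run benchmark/benchmarks.jl at two git revisions and compare them.
# MyPkg and the revision names are placeholders.
benchpkg MyPkg --rev=master,my-feature-branch

# Render the comparison as a markdown table (e.g. for pasting into a PR).
benchpkgtable MyPkg --rev=master,my-feature-branch
```

This is the same comparison the GitHub Action posts as a PR comment, just run by hand against whatever revisions you like.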