How does the PkgEval badge work on JuliaHub?

Could someone clarify the details of the “PkgEval” badge on JuliaHub?

  • When does this badge get updated? It does not seem to be updated every time I push to the master branch of my package. Is it on some timed interval, or does it only get updated when there’s a new release for a package?
  • What version of the package is being evaluated? It appears that if I go to older package versions, I get a history of PkgEval badges. On the other hand, I’ve seen the badge for DocumenterCitations turn from red to green without me making a new release. Is it testing the last release, or the current master?
  • How can I access the logs? I don’t seem to be able to find them in the JuliaHub UI. I know the logs for building the documentation are hiding behind the circled “i” after the package name, but that doesn’t seem to include PkgEval (unless I’m blind).
  • Is there a GitHub workflow action for checking the PkgEval status of my package? I can probably put something together based on the PkgEval README, but it would be neat to have a single action that does whatever JuliaHub does (which may change).
1 Like

I think PkgEval is run every day. The reports are here: https://github.com/JuliaCI/NanosoldierReports/tree/master/pkgeval (there you can also see which version is run; it is usually the latest). See also the JuliaCI/NanosoldierReports README for information about the badge.

PkgEval is run automatically by a bot every night on the latest Julia master branch. See the comments on commit JuliaLang/julia@5ae88f5 (#45109) for an example invocation.

It just goes through the Pkg resolver so it should use the latest released version unless there are incompatibilities with the Julia version. The badge likely turned green again because the issue causing the failure got fixed on Julia nightly.
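If you want to check which released version the resolver would actually pick on the Julia build you are running, here is a minimal sketch (not part of PkgEval itself; the package name is only an example):

```julia
using Pkg

# Resolve the package in a throwaway environment on the current Julia build;
# this approximates which released version PkgEval would end up testing.
Pkg.activate(temp=true)
Pkg.add("DocumenterCitations")   # example package; substitute your own

for (_, info) in Pkg.dependencies()
    info.name == "DocumenterCitations" && println("Resolved version: ", info.version)
end
```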

The latest report is available at https://juliaci.github.io/NanosoldierReports/pkgeval_badges/report.html or you can look for them by date in https://github.com/JuliaCI/NanosoldierReports/tree/master/pkgeval/by_date if you’re looking for a specific report. Just clicking on the badge for a particular package should also direct you straight to the relevant logs.

JuliaHub just serves the badges provided by Nanosoldier. If you want to add one to your own README, you can find instructions in the JuliaCI/NanosoldierReports README.

Running PkgEval.jl yourself is probably not necessary. You should catch most issues by just adding Julia nightly to your CI matrix. There might be some issues specific to PkgEval that don’t show up on CI, since it uses a debug build and runs under rr on dedicated infrastructure. If that is the case, though, the underlying issue is most likely in Julia itself and you should open an issue, or you are doing something unsafe in your own code.

2 Likes

Tip: (on Safari at least) you can’t search for a package in the HTML-ish report view on GitHub if your package is inside one of the <details> sections (which will be the case if it has failed…:)). You’ll probably find it if you search the Code view.

1 Like

The badge likely turned green again because the issue causing the failure got fixed on Julia nightly.

The reason it was failing before is that I didn’t have stdlib dependencies (like Pkg) listed in test/Project.toml. Maybe this was changed on Nightly because of the whole discussion around whether stdlib compat bounds need to be declared in the main Project.toml file for auto-merging in General (to the extent that I followed that discussion).
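For reference, a minimal sketch of how such a stdlib test dependency can be recorded (run from the package root; using the Pkg stdlib from the example above):

```julia
using Pkg

# Activate (or create) test/Project.toml and declare the Pkg stdlib as an
# explicit test dependency, so the test environment can always resolve it.
Pkg.activate("test")
Pkg.add("Pkg")
```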

Just clicking on the badge for a particular package should also direct you straight to the relevant logs.

That’s actually what I was expecting, but instead, clicking on the badges does “Copy markdown link”. I’d very, very much like clicking on the badges to take me to the logs instead, so consider this a feature request :wink:

For packages that haven’t opted out, it would also be good to have a badge that links to the logs for the documentation build. They’re pretty hard to find hiding behind the :information_source:.

Running PkgEval.jl yourself is probably not necessary.

Well, if it’s testing only the last release, it would be good to check it before I make a release. And I’m finding that PkgEval is very idiosyncratic (and undocumented) in how it runs the test suite, so testing on nightly would not catch issues with PkgEval; see below.

You should catch most issues by just adding Julia nightly to your CI matrix.

I actually don’t think anyone should have Julia nightly in the normal CI matrix. I used to, and it was breaking all the time for reasons completely unrelated to my package. One of the main reasons was that some packages (like JET) exclude nightly from their compat bounds, so I couldn’t even instantiate my test/Project.toml.

Given that it’s mostly “noise”, it feels like a waste of significant compute power / carbon for the entire ecosystem to routinely test against nightly. It’s also a problem that it’s hard to keep a GitHub workflow job from failing the entire pipeline. I wouldn’t want to have “X” marks on all my commits for jobs that should be “optional”/“advisory”. Granted, that’s a GitHub problem, and maybe I just gave up too soon trying to find a solution for that.

I might still want to run a check for PkgEval on release branches (but definitely not on every commit on master, or every PR).

In any case, just testing against nightly wouldn’t help much with PkgEval. The way it runs the test suite is different from how I normally run my test suite (instantiate test/Project.toml, dev-install the current project, include test/runtests.jl). I think PkgEval is using something like the “canonical” Pkg.test() after activating the main Project.toml file. However, I’ve seen Pkg.test() run through when PkgEval fails. As far as I could tell, for DocumenterCitations it was because the “normal” Pkg.test() runs within the current git checkout (so instantiating Documenter.Document() doesn’t fail due to not knowing the git remote of the current project), whereas PkgEval seems to run in a directory that is not a git checkout.
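For concreteness, here is a sketch of that “manual” workflow, run from the package root:

```julia
using Pkg

# Instantiate the test environment, dev-install the current checkout,
# and run the test file directly in this session.
Pkg.activate("test")
Pkg.develop(path=pwd())
Pkg.instantiate()
include("test/runtests.jl")
```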

Bottom line: please document exactly how PkgEval runs the test suite (a workflow action would probably be a good place to do that, or at the very least it should be easy to reproduce).

You can run CI only when new tags are pushed. Re nightly having frequent breakages – I don’t see how using PkgEval for the tests would change that.

Sure, more documentation would always be nice, but it’s literally just a plain Pkg.test("pkgname") call. This is the de-facto standard way of testing across the ecosystem and can easily be run locally. You can of course use whatever setup you like for testing, but that’s not something PkgEval can be expected to support.

Pkg.test() will look for the package corresponding to the currently activated environment, which happens to be a cloned git repository in your case. You can’t assume that the package directory is tracked by git in general. If you want to test this you can just activate a temporary environment, add your package via Pkg.add and then try Pkg.test("pkgname").
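In other words, something like this sketch (the package name is just an example) should be much closer to a PkgEval-style run than testing inside the git checkout:

```julia
using Pkg

# Test the registered package from a throwaway environment, outside of any
# git checkout -- closer to the conditions PkgEval runs under.
Pkg.activate(temp=true)
Pkg.add("DocumenterCitations")   # example package; substitute your own
Pkg.test("DocumenterCitations")
```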

In the end though, PkgEval is a tool meant for Julia development and assessing the effect of changes in base across the ecosystem. Help from package developers in tracking down issues is always appreciated, but keep in mind that failures are expected and tracking these down is just one part of the Julia release process.

2 Likes

I agree with everything you say, and I am wondering why the badge exists at all. It is not as if people are checking their badges daily to find out whether the package is broken on master. Or is this the intention? Then I should check mine more often.

Maybe a bit of a tangent, but I’m somewhat ambivalent about the existence of this badge, too.

On my least charitable days, I feel like “someone wrote some tool to evaluate my package, and it fails for some non-transparent reason. Seems like their problem”. On the other hand, I love what JuliaHub is trying to do here. The fact that new versions of Julia get evaluated against the entire ecosystem to protect against breaking changes is absolutely wild.

In principle, I would like my packages to pass PkgEval and be included in that check, if at all possible. It’s just that PkgEval is still a little bit too much of a black box, so it’s not always trivial to do that.

The other reason I’m ambivalent about the badge is the SEO that ranks JuliaHub very high in Google when someone searches for a package name. Which means that if someone goes there, perceives it as an authoritative page for the package, and then sees a red/failing badge, it makes it look like there’s something wrong with the package. Not great. But also, as a package author, I definitely would like to keep track of whether my package passes PkgEval or not.

I’m not sure what the ideal solution is. Maybe instead of a badge, there should be a bot opening issues in the underlying repository. (But again, I’m not sure if that would go over well especially with new package authors. They might feel like “Who is this PkgEval, and why is this my problem?”)

Or maybe it would be good if packages could opt out of the badge. I managed to get PkgEval to work for DocumenterCitations, but it’s still failing for my packages in the JuliaQuantumControl org. These have rather complicated test suites (including dynamically downloading test data). I’ll look into whether I can fix them at some point, but in general, there might just be packages where the author decides that PkgEval just can’t handle their test suite. In that case, it would be better to have a green “PkgEval: opt out” or “PkgEval: n/a” badge instead of a red “PkgEval: fail” badge.

I agree with you. The problem with the badge is that it gives the impression of being some seal of quality, which it is not. Playing nice with Julia master is not a sign of quality. It can also change from day to day without the author being aware or being at fault.

PS: If your tests take too long for the automatic PkgEval run, you can check for the environment variable JULIA_PKGEVAL and adjust the tests (like we do here: https://github.com/thofma/Hecke.jl/blob/0fd09c1f32fa27278a367103bea7587a81603b4b/test/runtests.jl#L30)
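For reference, a minimal sketch of that pattern in test/runtests.jl (the file names for the reduced and full test sets are hypothetical; the actual check in Hecke.jl may look slightly different):

```julia
# Skip or shorten expensive tests when running under PkgEval,
# which sets the JULIA_PKGEVAL environment variable.
const IS_PKGEVAL = haskey(ENV, "JULIA_PKGEVAL")

if IS_PKGEVAL
    include("quick_tests.jl")   # hypothetical reduced test set
else
    include("full_tests.jl")    # hypothetical full test set
end
```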

1 Like

It’s important to note that the PkgEval infrastructure has nothing to do with JuliaHub. JuliaHub simply displays a badge.

but it’s literally just a plain Pkg.test("pkgname") call.

Thanks! That really helps!

This is the de-facto standard way of testing across the ecosystem and can easily be ran locally.

That’s actually the more general problem (unrelated to PkgEval): I’m not sure that actually is the de-facto standard. It feels like the “standard” (presumably defined in the Pkg documentation) is rather unspecified, and this might be something to improve on. I feel like all of the following are pretty common:

  • run Pkg.test() (I think that’s what the Pkg documentation tells people to use, so that’s what I might consider the most “canonical” one)
  • use TestEnv, and include("test/runtests.jl")
  • use the julia-actions/julia-runtest GitHub action (which has its own test_harness that may subtly influence the testing environment)
  • Custom creation of a test environment (in a CI script/Makefile), with include("test/runtests.jl") at the end (what I use when testing locally)
  • use Pkg.test("pkgname") (this one, which PkgEval happens to use, is the one I’d think of least, so I’m a bit skeptical about how “canonical” it really is)

All of these are pretty similar and give the same results for many packages, but they all behave differently in subtle edge cases, since they differ in how the (sandbox) environment is created (see the sketch below for two of them side by side). I’m not sure it’s fair to assume that people can easily navigate these subtle differences. So, this is a long-term call for documenting these things more clearly :wink:
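To make the difference concrete, here are two of the approaches from the list side by side (a sketch; run one or the other, not both in the same session). TestEnv.activate() comes from the TestEnv.jl package:

```julia
using Pkg

# (a) The "canonical" route: Pkg.test() spawns a fresh, sandboxed process with
#     the test environment resolved from test/Project.toml (or [extras]/[targets]).
Pkg.activate(".")
Pkg.test()

# (b) The TestEnv route, convenient for interactive work in the REPL: activate
#     a test environment for the current project, then include the test file
#     directly in the current session.
using TestEnv
TestEnv.activate()
include("test/runtests.jl")
```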

As for PkgEval, just putting in the documentation that it runs Pkg.test("pkgname") would pretty much solve the problem, as far as I’m concerned.

For future reference: I ended up with a PkgEval.yml workflow that runs as part of the release process.
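For anyone curious what such a check might invoke, here is a sketch based on my reading of the PkgEval.jl README at the time; the Configuration/Package/evaluate names are PkgEval.jl’s API as documented there and may well change, so treat this as an assumption rather than a stable interface:

```julia
using PkgEval

# Evaluate a single package against a nightly Julia build, roughly the kind
# of run that feeds the badge (API as shown in the PkgEval.jl README).
config  = Configuration(; julia = "nightly")
package = Package(; name = "DocumenterCitations")   # example package

results = evaluate([config], [package])   # one row per (configuration, package)
println(results)
```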

2 Likes