How to use Manifests for a clean environment for reproducibility

Sloppy use of language on my part. My understanding is that the Manifest.toml file is the file you want to keep for reproducibility, as it records the exact versions of all the packages used, including indirect dependencies, while the Project.toml file is where you list your direct dependencies and any version constraints you need. The Manifest.toml file is automatically generated, while the Project.toml file can be tweaked by the user. I’ll make sure to clearly distinguish manifest and project in future discussions!

Veering off topic now. I must admit that I’m not sure how you would use the Manifest.toml file to set up a clean environment for reproducibility, since the system overwrites it. Do you copy the content of the Manifest.toml into Project.toml or something?

The normal thing to do for reproducibility is to commit both Project.toml and Manifest.toml to version control and make sure that you run with this environment activated, e.g. by starting a script file with

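using Pkg
# Activate the environment in this script's directory (its Project.toml and Manifest.toml).
Pkg.activate(@__DIR__)
# Install exactly the package versions recorded in Manifest.toml.
Pkg.instantiate()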

Obviously you shouldn’t update packages or similar in this environment after you have prepared it, but if you do, your version control will show that and let you undo it.

For reproducibility you also want to make sure that you only depend on registered packages or other stuff within the same repository, i.e. no developed or added local paths outside of your repository. If you’re really serious about reproducibility you will also tweak the load path to exclude the default environment.
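
As a rough illustration of the last two points, here is a minimal sketch that drops the default environment from the load path and lists dependencies tracked by a local path. The helper name `path_tracked_packages` is made up for this example, and it assumes the default environment appears in `LOAD_PATH` as the usual `"@v#.#"` entry.

```julia
using Pkg

# Remove the default shared environment ("@v#.#") from the load path so that
# `using`/`import` can only see the active project and the standard library.
filter!(entry -> entry != "@v#.#", LOAD_PATH)

# Names of dependencies that are tracked by a local path (e.g. added via `dev`).
# Whether such a path lies inside the repository still has to be checked by hand.
function path_tracked_packages(deps)
    return sort!(String[info.name for info in values(deps) if info.is_tracking_path])
end

offenders = path_tracked_packages(Pkg.dependencies())
isempty(offenders) || @warn "Dependencies tracked by local path" offenders
```

This would go near the top of the script, after the environment has been activated.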

I love the standardized package management in Julia and find it rather robust, but I’ve had a similar experience with the version management: packages are often held back to old versions or even downgraded. This seems to happen much more frequently than in other ecosystems and I’d like to know why!

Maybe it’s because Julia’s ecosystem is relatively immature and moving faster. Maybe developers are more casual about making new major (backward-incompatible) releases. Suppose A has just released version 0.13, and you have installed B 1.5, which supports A 0.13. If you later install C, which hasn’t been updated and only supports A 0.12, the resolver will downgrade both A and B. If A is fast moving, this can happen a lot.
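
To make the A/B/C scenario concrete, here is a toy model of it. This is not how Pkg’s resolver is implemented; the version lists, the B 1.4 release, and the helpers `caret_compatible` and `resolve` are all invented for illustration, and the compatibility check is a simplified caret rule for fully specified bounds.

```julia
# Simplified caret rule: for 0.x bounds the minor version must match,
# for bounds >= 1.0 the major version must match.
function caret_compatible(bound::VersionNumber, v::VersionNumber)
    v >= bound || return false
    bound.major > 0 && return v.major == bound.major
    bound.minor > 0 && return v.major == 0 && v.minor == bound.minor
    return v == bound
end

# Hypothetical registry contents.
available_A = [v"0.12.0", v"0.13.0"]
# Each B release and the A bound it declares.
b_compat = Dict(v"1.4.0" => v"0.12.0", v"1.5.0" => v"0.13.0")
# C has not been updated and still declares A 0.12.
c_bound = v"0.12.0"

# Newest (A, B) pair that satisfies B's own bound plus any extra bounds on A.
function resolve(extra_bounds)
    candidates = [(a, b) for a in available_A for (b, bound) in b_compat
                  if caret_compatible(bound, a) &&
                     all(x -> caret_compatible(x, a), extra_bounds)]
    return maximum(candidates)  # tuples compare lexicographically
end

without_C = resolve(VersionNumber[])  # A 0.13.0 with B 1.5.0
with_C = resolve([c_bound])           # A 0.12.0 with B 1.4.0: both downgraded
println("without C: ", without_C, "   with C: ", with_C)
```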

Another possibility is that we see more composition and reuse in Julia’s ecosystem, so packages have more dependencies on other packages by third-party developers, with different release schedules… This composition is great in some ways but also has costs… Maybe together with fast moving packages it can explain some of the difficulties with dependency resolution?

I think there are many reasons. Some that come to mind right away:

Core:

  • Julia takes package compatibility and semantic versioning quite seriously.
  • For multiple dispatch and composability to work well it’s not really feasible to have disjoint dependencies, so the environment has to resolve to a single set of package versions to use.

User side:

  • It’s common to cram too many packages into the default environment instead of using separate environments for different tasks, or activate --temp for throwaway tests of packages.

Packaging side:

  • Package authors are sometimes slow to add compatibility with new breaking versions of their dependencies.
  • (Variation) Compat is updated on master but no release is made so the compat update does not become available to the package ecosystem.
  • Compat with the new version is added but compat with the old version is immediately dropped, even if the breaking change didn’t actually affect the package.
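
For the user-side point above, a throwaway environment can also be created from a script rather than from the Pkg REPL. A minimal sketch, assuming a Julia version where `Pkg.activate` accepts the `temp` keyword; the package name in the comment is only a placeholder:

```julia
using Pkg

previous_project = Base.active_project()

# Create and activate a fresh environment in a temporary directory.
Pkg.activate(temp = true)
temporary_project = Base.active_project()

# Packages added now (for example Pkg.add("Example")) are recorded only in the
# temporary environment, not in the default one.
println("Working in: ", temporary_project)

# Switch back to the default environment when done.
Pkg.activate()
```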

I think this is an important question. Can someone split the topic? @mbauman?

If you’re a little paranoid about packages updating, you can additionally run ]pin --all in that environment. This stops Pkg from changing the version of any dependency until you free it again.
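
The same thing can be done from a script. A small sketch, assuming Julia 1.7 or later (where `Pkg.pin` and `Pkg.free` accept the `all_pkgs` keyword); `pin_environment` and its `project_dir` argument are names made up for this example:

```julia
using Pkg

# Pin every package in the environment at `project_dir` to its current version.
function pin_environment(project_dir::AbstractString)
    Pkg.activate(project_dir)
    Pkg.instantiate()          # make sure the recorded versions are installed
    Pkg.pin(all_pkgs = true)   # same effect as `pin --all` in the Pkg REPL
end

# Usage (not run here): pin_environment(@__DIR__)
# Undo later with Pkg.free(all_pkgs = true).
```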

Great, thank you! :grinning: