I run code on a cluster (using slurm). The sbatch file starts Julia in the package directory. For example:
julia --project="." --startup-file=no "longleaf/run_simple.jl"
Occasionally, the jobs crash with errors complaining that dependencies are not installed (even though they were when I ran the same code an hour ago). An example error message:
ERROR: LoadError: ArgumentError: Package Optim [429524aa-4258-5aef-a3af-852621145aeb] is required but does not seem to be installed:
- Run `Pkg.instantiate()` to install all recorded dependencies.
Stacktrace:
[1] _require(::Base.PkgId) at ./loading.jl:993
[2] require(::Base.PkgId) at ./loading.jl:922
[3] require(::Module, ::Symbol) at ./loading.jl:917
[4] include at ./boot.jl:328 [inlined]
[5] include_relative(::Module, ::String) at ./loading.jl:1105
[6] include(::Module, ::String) at ./Base.jl:31
[7] top-level scope at none:2
[8] eval at ./boot.jl:330 [inlined]
[9] eval(::Expr) at ./client.jl:425
[10] top-level scope at ./none:3
[...]
I have tried to work around this by running Pkg.instantiate() at the very top of the Julia file that Slurm calls (here: run_simple.jl). But this does not seem to work, perhaps because my package is loaded before the Julia file is run (by the --project switch).
My guess would be that you have an old Manifest.toml in your project’s directory. As far as I know, Pkg.instantiate() does not update the Manifest.toml, so in that case it installs an outdated dependency tree (e.g. one corresponding to an earlier version of the Project.toml).
To avoid this, try to run Pkg.resolve() before Pkg.instantiate(). This rebuilds your Manifest according to potentially changed dependencies in Project.toml first and then installs the dependencies.
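A minimal sketch of that order of operations, in case it helps. The helper name resolve_and_instantiate and the guard on Project.toml are my own additions for illustration; only Pkg.activate, Pkg.resolve and Pkg.instantiate are actual Pkg functions.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): bring the manifest in `dir` in line
# with Project.toml, then install what the manifest records.
function resolve_and_instantiate(dir::AbstractString = pwd())
    # Assumption: skip quietly when there is no project file to work with.
    isfile(joinpath(dir, "Project.toml")) || return false
    Pkg.activate(dir)    # make `dir` the active project
    Pkg.resolve()        # update Manifest.toml to match Project.toml
    Pkg.instantiate()    # download and install the recorded versions
    return true
end

resolve_and_instantiate()
```

Since Pkg.resolve() may rewrite Manifest.toml, this is probably only appropriate when pinned versions are not a concern.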
Note: This might not be what you want, however, if the packages installed on the cluster need to be exactly identical to the ones you use locally on a laptop. In that case, commit an up-to-date Manifest.toml (along with the Project.toml and the other Julia files) to a git repo and use that repo to keep the calculations on the cluster in sync with your laptop. That is my typical workflow.
I upload Manifest.toml together with all the .jl files before each run. So my understanding is that Pkg.resolve() is not needed (?). In fact, I may not want to run resolve() at all, so that the dependency versions stay unchanged (as the second part of your answer suggests).
What I’m wondering is: when I start Julia with --project=".", does it implicitly run using MyPackage before instantiate() or resolve() ever have a chance to run?
If your local Manifest.toml is in line with your Project.toml, and the Project.toml contains all the dependencies your code requires, then what you describe should work. Maybe, however, you are using a package that is not contained in the Manifest.toml. Could you check whether your Manifest.toml has an entry for Optim, to rule that out?
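One rough way to do that check from Julia is a plain text search of the manifest. This is only a sketch: manifest_mentions is a made-up helper, and it assumes the manifest writes package entries as [[Optim]] (older format) or [[deps.Optim]] (newer format).

```julia
# Hypothetical helper: does the manifest text contain an entry header for `pkg`?
function manifest_mentions(manifest_text::AbstractString, pkg::AbstractString)
    # Assumption: older manifests use [[Pkg]], newer ones use [[deps.Pkg]].
    return occursin("[[$pkg]]", manifest_text) ||
           occursin("[[deps.$pkg]]", manifest_text)
end

# Look at the Manifest.toml in the current directory, if there is one.
manifest_path = joinpath(pwd(), "Manifest.toml")
if isfile(manifest_path)
    println("Optim listed: ", manifest_mentions(read(manifest_path, String), "Optim"))
else
    println("No Manifest.toml found in ", pwd())
end
```

Running this from the project directory on the cluster, rather than on the laptop, would show what the job actually sees.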
To my understanding, --project="." is equivalent to plainly starting Julia and then calling Pkg.activate("."). So it does not run using on your package.
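If you want to see this for yourself, something like the following at the top of the script should show which project is active and that the package has not been loaded yet. MyPackage is just the placeholder name from your question.

```julia
# Path of the active Project.toml (or `nothing` if no project is active).
println("Active project: ", Base.active_project())

# Activating a project does not load any package into Main.
println("MyPackage loaded: ", isdefined(Main, :MyPackage))
```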
Here is where truncating the stacktrace comes back to bite me.
I am not directly using Optim; it only appears as a dependency of NLopt (which is in Project.toml). Optim does, however, appear in Manifest.toml.
So --project="." doesn’t instantiate the Manifest? It only activates the dir project?
Then I guess the workflow would be julia --project=. myscript.jl, and within myscript.jl the first line (before any other using/import) would have to be using Pkg; Pkg.instantiate().
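As a sketch, the top of myscript.jl could then look roughly like this. The helper ensure_instantiated and its Project.toml guard are hypothetical; the commented-out using line stands in for whatever the script loads.

```julia
using Pkg

# Hypothetical helper: install whatever the Manifest.toml in `dir` records,
# without changing any versions (no Pkg.resolve() here).
function ensure_instantiated(dir::AbstractString = pwd())
    # Assumption: do nothing when the directory has no Project.toml.
    isfile(joinpath(dir, "Project.toml")) || return false
    Pkg.activate(dir)    # same project that --project=. points at
    Pkg.instantiate()    # install the versions pinned in Manifest.toml
    return true
end

ensure_instantiated()

# Only after this point load the project's packages, e.g.
# using MyPackage
```

Because this uses the uploaded Manifest.toml as-is, the versions on the cluster should stay the same as the local ones.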