First Pluto notebook launches are slower on Julia 1.9 beta 3

  1. for local reproducibility.
  2. because you might want to use not exactly the same packages but, say, one extra package.

(Stacking environments doesn’t solve this either, because it breaks point 1 and often triggers re/pre-compilation.)

Yeah, funny :rofl:. But truly, asking for lecture code (a notebook or otherwise) to be reproducible only with exactly the set of packages with which it was created, ignoring that packages have since received bug fixes and feature upgrades, is not reasonable. This is not what we ask of packages, which have much more flexible compatibility requirements.

What I’m imagining is that “instantiating” a notebook should be similar to installing all packages with appropriate compatibility, which is what happens when one adds the packages one by one by hand and there are compat entries in the Project.toml. This is less demanding in terms of the proliferation of installed package versions, compilation, etc.

As a general rule, only the latest non-breaking versions of each package should be installed, unless in very specific contexts where exact reproducibility is relevant, like when chasing bugs.
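To make the compat-based idea concrete, here is a minimal sketch (not something Pluto does today, as far as this thread says) of an environment that ships only a Project.toml with a `[compat]` section and no Manifest.toml. The package `Example` and the bound `"0.5"` are placeholders, and `write_compat_project` is a hypothetical helper name. With no manifest present, `Pkg.instantiate()` has to resolve versions itself, within the compat bounds.

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical helper: write a Project.toml that has [deps] and [compat]
# but deliberately no Manifest.toml next to it.
function write_compat_project(dir::AbstractString)
    project = Dict(
        "deps" => Dict("Example" => "7876af07-990d-54b4-ab0e-23690620f79a"),
        "compat" => Dict("Example" => "0.5", "julia" => "1.6"),
    )
    path = joinpath(dir, "Project.toml")
    open(path, "w") do io
        TOML.print(io, project)
    end
    return path
end

dir = mktempdir()
project_file = write_compat_project(dir)
print(read(project_file, String))

# To actually install (needs network access), one would then run:
#   using Pkg
#   Pkg.activate(dir)
#   Pkg.instantiate()   # no manifest, so Pkg resolves within [compat]
```

Checking in a Manifest.toml next to this file would instead pin the exact versions, which is the other use case discussed in this thread.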

You’re conflating different needs.

There are cases where full reproducibility is not necessary, in which case only the project file with appropriate compat bounds is sufficient, to allow updating packages, and so on. This is an absolutely fine situation. This is probably your need, but not the only situation.

Then there are other cases where one does not want packages upgraded for no reason unless explicitly requested, perhaps because there is a pipeline which is known to work with a specific set of packages, and you can’t afford to randomly update a dependency that might behave slightly differently and break your pipeline without extensive testing. You probably think I’m exaggerating, but this is precisely what happened a few weeks ago when the PkgServer broke because an updated dependency disrupted the pipeline: Lock `nginx-certbot` image to known-good versions · JuliaPackaging/PkgServer.jl@9354aa5 · GitHub (this was outside of the Julia ecosystem, but that’s beside the point). In production runs like this, reproducibility isn’t overrated but vital. One would hope that new versions of packages only get better, but unfortunately that’s not necessarily true and mistakes can happen. In production, people can’t afford the risk of updating packages willy-nilly just because you don’t care about reproducibility.

What I’m saying is that that is by far the most common scenario, particularly for lectures. Exact reproducibility is of course useful and vital in other contexts, but not for the everyday user, student, etc. So maybe the approach to package management in some common scenarios, such as everyday environments and Pluto notebooks, could be less restrictive, which may provide some benefit in the context of what is raised in this thread.

I was a fan of Pluto before, and considered this package a distinctive feature of the Julia ecosystem, until one day I found I needed to re-install every package every time I ran a notebook.

I think this is the result of chasing reproducibility. For me, the charm of Pluto was that after you modified a cell, the other related cells got updated automatically, which means you don’t need to execute them yourself and maybe forget one of them. This really reduces the mental burden and makes the analysis of output extremely smooth.

Reproducibility is a good idea, but like any other good idea, when you push it to the extreme, it is not that good anymore. (If we really want reproducibility, then the next step would be installing the specified version of Julia, and the libraries it uses… but I just want to run a simple notebook!)

That’s all well and good until you’re teaching a 40-person class and in the middle of class the notebook they’re using suddenly stops working. Why? Who knows? Some dependency flubbed a supposedly compatible update and broke everything. Or maybe some other package was depending on some internal function that got changed or removed. Do you really want to be dealing with that in the middle of class?

The thing is Julia already supports reproducibility without any hassle, so if this is causing problems in Pluto, that’s a problem with how Pluto is using environments, not with reproducibility as a concept. Why does Pluto need to reinstall everything every time a notebook is started?

@StefanKarpinski If I create two different environments with exactly the same package dependencies, do they share/reuse the same compiled/cached code, or is precompilation redone for both just because they are different environments?

More generally, is there any post or other place where one can learn a bit more about how compiled code is reused between different environments, so that one can better understand how to make Pluto exploit this functionality as much as possible with its automatic package management?

That happens even with manifest files, because students upgrade stuff and change stuff. Both compat entries and manifest files allow one to start over.

(I never suggested everyone in the class share the same package repo).

P.S. I do teach with Julia for classes that size. My approach is to make the course a package. They install the package, which installs its dependencies, and things usually work well. A manifest file is overkill. If Julia usability were that sensitive to package versions (happily, the current state is that it is not!), the classes could not be much different from just reading a static blog.

If there’s a manifest file checked in then there’s a known good version to go back to. Otherwise you’re just out of luck.

In that case all the package devs out there deserve massive kudos because they are doing a great job. Glad that you feel confident enough to rely on it.

I think they do. And your example above, which gets an example with graphics and such running in a few minutes, is another illustration of that. Julia is working very well on that front (ease of package installation and getting a functional working environment).

I will only try to express more clearly what I said: maybe having something like environments that are based on compat entries, instead of saving exact package versions, could be useful to minimize the accumulation of package versions and the need for different compiled code.

Now, whether that serves any purpose from a technical point of view, I don’t know. Maybe in the context of multiple Pluto notebooks it does. Maybe more. Maybe it doesn’t.

If the package and all its dependencies are all the same version then the precompile files can be reused. If any dependency changes then you need to recompile.

Then reopening the same notebook should not, in principle, reinstall and recompile packages.
Any tips on what to look at to see if reopening the same notebook actually triggers re-installation or precompilation?

I don’t know, not a Pluto dev. But I think Pkg ops print output in the terminal if you started it there.

Thanks, but I know a bit about Pluto, so I meant more on the standard Julia side: how to know whether things are being reinstalled or not. From your answer I gather it is just a matter of checking the output of the Pkg REPL.
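Besides watching the Pkg output, another rough way to check from plain Julia is to look at the precompile cache files on disk. The sketch below assumes the usual depot layout, where caches live under `compiled/vX.Y/<PackageName>/` in each depot. `precompile_cache_files` is a hypothetical helper name and `Example` is only a placeholder package.

```julia
# Hypothetical helper: list precompile cache files for one package,
# newest first, across all depots in DEPOT_PATH.
function precompile_cache_files(pkgname::AbstractString)
    slug = "v$(VERSION.major).$(VERSION.minor)"
    files = String[]
    for depot in DEPOT_PATH
        dir = joinpath(depot, "compiled", slug, pkgname)
        isdir(dir) || continue
        append!(files, filter(isfile, readdir(dir; join=true)))
    end
    return sort(files; by=mtime, rev=true)
end

# Print each cache file with its last-modified time.
for f in precompile_cache_files("Example")
    println(basename(f), "  ", Libc.strftime("%Y-%m-%d %H:%M:%S", mtime(f)))
end
```

If the file list and timestamps are unchanged between two launches of the same notebook, that package was most likely not precompiled again.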

This is not true! You can easily test this by running a notebook two times.

This concern is also answered in our FAQ: 🎁 Package management · fonsp/Pluto.jl Wiki · GitHub

You can also use our new Status tab to understand the launch process better: 🩺 Process status by fonsp · Pull Request #2399 · fonsp/Pluto.jl · GitHub

You can also see this in our source code: when we open a notebook, we extract the embedded Project/Manifest files and write them to a temporary folder, and then we just call Pkg.instantiate in that folder. Pluto just wraps around Pkg, so all of its benefits (like caching, parallel precompilation, pkgimages) apply to Pluto notebooks as well.
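As a rough illustration of the steps just described (this is not Pluto’s actual source), the process could be sketched as follows. `materialize_env` is a hypothetical name, and the two strings passed in the usage line are empty placeholders standing in for the Project/Manifest contents embedded in a notebook file.

```julia
using Pkg

# Hypothetical helper, not Pluto's real code: write the two embedded files
# into a fresh temporary folder and optionally instantiate that environment.
function materialize_env(project_toml::AbstractString, manifest_toml::AbstractString;
                         instantiate::Bool=false)
    dir = mktempdir()
    write(joinpath(dir, "Project.toml"), project_toml)
    write(joinpath(dir, "Manifest.toml"), manifest_toml)
    if instantiate
        Pkg.activate(dir)   # note: this switches the active environment
        Pkg.instantiate()   # installs what the manifest lists, reusing caches
    end
    return dir
end

# Placeholder contents; a real notebook would supply its embedded files.
env_dir = materialize_env("[deps]\n", "")
println(readdir(env_dir))
```

Because the temporary folder only holds the two TOML files, the installed packages and their precompile caches still live in the shared depot, which is why a second launch can reuse them.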

It would be better if Pluto had full documentation that users could browse via the (Documenter.jl) side panel.

Thanks for the information. The last time I used Pluto was more than a year ago, so it must have changed a lot since then. I remember that for a new notebook I needed to install the packages I wanted to use. I hope this point has been improved.

I wonder how Pkg garbage collection comes into play here. The Pluto environments are temporary Pkg envs; are they deleted every time Pkg.gc() is called (which happens automatically during Pkg operations like update)?

I do understand your request, but writing full documentation takes a lot of time (a lot, a lot),
and during development things might also change rapidly, which makes documenting them even more effort.
Maybe, as a community, we could start by documenting the parts of Pluto that are considered stable by now?
