Last year I did basically this, and shipped the class notebooks with system images that were specified in an Artifacts file. This worked quite well (for all the students that followed the instructions and downloaded the correct version of Julia): the students didn't need to know anything about environments and everything was pretty fast from the start.
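For context, a minimal sketch of how such a setup could be wired together, assuming a hypothetical artifact name and file layout (not the actual course code):

using Artifacts  # stdlib; provides the artifact"..." string macro

# Resolve a sysimage shipped as a (possibly lazy, platform-specific) artifact
# declared in the package's Artifacts.toml, then start Pluto with it.
sysimage_dir = artifact"course_sysimage"            # hypothetical artifact name
sysimage = joinpath(sysimage_dir, "sysimage.so")    # hypothetical file name inside it

# Relaunch Julia with the prebuilt sysimage and run the notebook server:
run(`julia --sysimage=$sysimage -e 'using Pluto; Pluto.run()'`)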
That's not an excuse not to document; I mean, that's even more reason to document, no?
But also, Pluto is not used as a library, so the things to document are mainly how-tos and tips; look at Configuration · Revise.jl.
Oh, that is not meant as an excuse, but as an explanation, if there are not enough people contributing to such documentation. After all, such a package is also a community effort.
IMHO, Pluto is known to be the opposite of that (I mean it neutrally). People like Pluto because its authors have a clear vision of its design and don't want Pluto to be anything other than that. This is evident, for example:
- Pluto has repeatedly urged people NOT to use GitHub issues for feedback: how is the community supposed to help discuss the direction if the feedback is only available to core devs? e.g. Delete button is too deep · Issue #1310 · fonsp/Pluto.jl · GitHub
Anyway, this is off-topic, but my point is: the first step to making something a community effort is to have docs, at least a devs' doc. We shouldn't expect people to find out how the package works by trial and error AND write docs for a package that's not even theirs and in which they don't have a say in most decisions.
I think there is a difference between the main design and the 'small steps' like contributing to the documentation (maybe just the docstrings would already improve things here). So my opinion here seems to be different from yours, which is fine of course. I think docstrings and the usage of parts of a package can also be documented by the community. And sure, we got a little off-topic here; sorry to the rest for that.
Should (does?) Pluto.jl have a way to save an optional Manifest file (I'm not sure if it's always saved with the Project, or neither)? I mean, to you it seems like overkill to have the file, and to others it seems like a very good feature to have it by default. I see no downside to including it anyway as an optional feature, and I do see value (your view) in having all the latest versions of packages (e.g. for speed), or at least guaranteed to be the same or later than in the Manifest file. Then if things seem off you could click on some 'reproducibility' button to enable the Manifest file (or some users could have it set by default for themselves).
Last year I did basically this, and shipped the class notebooks with system images that were specified in an Artifacts file
Then it's for e.g. x86_64, and not e.g. ARM (or WebAssembly; that, together with Shiny for Python / Shinylive, is what we will compete with now, and it's fast). Or do they/Artifacts support "fat binaries"/multi-arch for sysimages? Or could you then have different sysimages for different archs to save on download, and if none is supported for your platform, does Pluto default to no sysimage?
Should (does?) Pluto.jl have a way to save an optional Manifest file (I'm not sure if it's always saved with the Project, or neither)?
As of now, if you use packages and don't disable the pkg manager, a manifest is embedded in the notebook together with the project, and it's always used (it's not optional).
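For reference, this is roughly what that embedded environment looks like at the bottom of the notebook file when the built-in package manager is active (abbreviated, with illustrative contents; the hidden cell IDs and exact entries vary):

PLUTO_PROJECT_TOML_CONTENTS = """
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
"""

PLUTO_MANIFEST_TOML_CONTENTS = """
# This file is machine-generated - editing it directly is not advised
[[deps.Example]]
git-tree-sha1 = "..."
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.3"
"""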
Or could you then have different sysimages for different archs to save on download, and if none is supported for your platform, does Pluto default to no sysimage?
Yes, I created sysimages for the three most common architectures that students use. For an unknown architecture (less than 5% of students), it doesn't download a sysimage. These students can use the Pluto notebooks of the course without a sysimage, or create one for their architecture by running MLCourse.create_sysimage().
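For anyone curious, such a sysimage can be built with PackageCompiler.jl; a minimal sketch, assuming an illustrative package list and file names rather than the actual course setup:

using PackageCompiler

create_sysimage(
    ["Pluto", "Plots", "DataFrames"];                       # packages to bake into the image
    sysimage_path = "mlcourse_sysimage.so",                 # output file (illustrative name)
    precompile_execution_file = "precompile_workload.jl",   # script exercising typical notebook code
)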
I think that a scenario where you share something with a one-time user who prefers to avoid full precompilation, because only a simple script needs to be run, should also be on our radar.
An idea I had in this PR comment is that maybe we could have an ENV entry that would allow packages to get a signal about whether precompilation is desired or not. Would something like this make sense?
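To make that concrete, a hedged sketch of what a package author could do if such a switch existed; the variable name JULIA_PKG_PRECOMPILE_WORKLOAD is made up for illustration, not an existing ENV entry:

using SnoopPrecompile

# Inside a package's top-level code: only run the extra precompile workload
# when the (hypothetical) opt-out variable is not set to "false".
if get(ENV, "JULIA_PKG_PRECOMPILE_WORKLOAD", "true") != "false"
    @precompile_all_calls begin
        sum(rand(100))   # stand-in for a representative workload of the package
    end
end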
I think that a scenario where you share something with a one-time user who prefers to avoid full precompilation, because only a simple script needs to be run, should also be on our radar.
In the long run, the way to handle this will probably be to run things in an interpreter while code compiles in the background. That way you get the benefit of low latency from the interpreter, but still get the benefit of compilation for longer-running tasks.
But I think this is straying from the topic of Pluto notebooks. Personally I would like the option to run all Pluto notebooks within a given environment, rather than having an environment per notebook, so that I can instantiate a single environment for a project with several notebooks, and have the option to run an existing notebook within a specified environment rather than the one cached in the notebook.
Since the Pluto issue is just an instance of the more general issue of working with separate environments combined with high release frequencies of packages (both good things), I wonder what others think about this proposal to add already installed packages when possible:
Could Pkg.add perhaps add the latest compatible locally installed version? I want add to add things a.s.a.p., and update can take time to update. I use separate environments for different scripts, and often use packages with large dependency trees like ModelingToolkit and Makie. When adding these packages to a new environment because I want to quickly try something, I almost always have to wait a long time since usually some of their deps have updated since my last install and it needs to preco…
But I think this is straying from the topic of Pluto notebooks.
The issue is not specific to Pluto in any way; it's just a single instance. Everyone who uses environments for many projects/analyses/scripts is affected, because even very similar environments created on different dates differ a lot, and require recompilation; see my quantification of this above.
Personally I would like the option to run all Pluto notebooks within a given environment
But… That's already supported? 🙂 Package management · fonsp/Pluto.jl Wiki · GitHub
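Concretely, per that wiki page, a notebook that activates an environment explicitly makes Pluto switch off its built-in package manager; a first cell along these lines should do it (the path is illustrative):

begin
    import Pkg
    Pkg.activate(joinpath(@__DIR__, ".."))   # shared project environment for all notebooks
    Pkg.instantiate()                        # install exactly what its Manifest.toml pins
end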
rather than the one cached in the notebook.
It's not a "cache", it's just a regular environment. Different projects/analyses need different envs anyway so that they can be reproduced later. Otherwise, with a shared env, updating/adding a package for AnalysisA can silently break AnalysisB.
Could Pkg.add perhaps add the latest compatible locally installed version?
That sounds like Pkg.offline.
Great, thanks, that is indeed a big part of it. So more or less what I'm proposing is to default to an offline add; if that fails, do an online add. Effectively this:
julia> Pkg.offline(true)

(jl_89Sowl) [offline] pkg> add Example
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Example [7876af07]:
 Example [7876af07] log:
 ├─Example [7876af07] has no known versions!
 └─restricted to versions * by an explicit requirement — no versions left

julia> Pkg.offline(false)

(jl_89Sowl) pkg> add Example
   Resolving package versions...
   Installed Example ─ v0.5.3
    Updating `C:\Users\visser_mn\AppData\Local\Temp\jl_89Sowl\Project.toml`
  [7876af07] + Example v0.5.3
    Updating `C:\Users\visser_mn\AppData\Local\Temp\jl_89Sowl\Manifest.toml`
  [7876af07] + Example v0.5.3
Precompiling environment...
  1 dependency successfully precompiled in 1 seconds. 33 already precompiled.
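In other words, something like the following hedged sketch (add_offline_first is a hypothetical helper, not an existing Pkg API):

import Pkg

function add_offline_first(pkg::AbstractString)
    Pkg.offline(true)
    try
        Pkg.add(pkg)         # resolve against locally installed versions only
    catch
        Pkg.offline(false)   # nothing compatible locally; retry against the registry
        Pkg.add(pkg)
    finally
        Pkg.offline(false)
    end
end

add_offline_first("Example")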
The issue is not specific to Pluto in any way; it's just a single instance.
The Project.toml and Manifest.toml stuff works exactly as many like. I tell students to instantiate the manifest once to reproduce an environment, with the expectation set that the instantiation process might be slow, as with any other package system like conda etc., or even a little longer because it needs to do more high-performance compilation; but it is a one-time thing.
But that is a fixed cost, and afterwards it is blindingly fast these days. Students are told not to add packages to the shared notebook environment haphazardly, but they rarely would anyway. No package operations ever occur unless manually triggered.
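That one-time step is just the standard pattern, run from the folder containing the shared Project.toml and Manifest.toml:

import Pkg
Pkg.activate(".")     # the shared course/project environment
Pkg.instantiate()     # install the exact pinned versions and precompile them once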
Are you sure this isn't Pluto-workflow specific? I haven't encountered it in Jupyter, VSCode, or anything else for a long time. If anything reinstalls or rebuilds, it is because of a decision I made.
Since the Pluto issue is just an instance of the more general issue of working with separate environments combined with high release frequencies of packages (both good things),
A key feature of Project/Manifest setup is to avoid that sort of thing. The speed of package evolution should be irrelevant since you want a reproducible snapshot. If you aren't using shared manifests for each project (or set of lecture notes) then I can understand, but that is missing out on an amazing feature. Or maybe the feature already exists in Pluto but for whatever reason people aren't using it in that way?
That sounds like Pkg.offline.
Perhaps the quick add via using CSV (and answering "y") should do this, if possible? (And Pluto could follow that.) While ] add CSV is more deliberate, and could behave as now.
The recommended approach is in PSA for SnoopPrecompile: turning off extra workload for specific packages, and ENV is discouraged.
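For completeness, that PSA describes a Preferences.jl-based switch; if I remember it correctly it looks roughly like this (the preference key is from memory, so check the PSA for the exact name):

using SnoopPrecompile, Preferences

# Skip the extra precompile workload for specific packages (key name from memory):
set_preferences!(SnoopPrecompile, "skip_precompile" => ["PackageA", "PackageB"]; force=true)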
Personally I would like the option to run all Pluto notebooks within a given environment, rather than having an environment per notebook
That's possible via Pkg.activate(@__DIR__).
If you also want to run all notebooks in the same process then, although not recommended, you can set use_distributed=false.
Students are told not to add packages to the shared notebook environment haphazardly, but they rarely would anyway. No package operations ever occur unless manually triggered.
Sure, if you have a single environment that you never modify, no recompilation happens. That doesn't depend on the REPL/VSCode/Pluto at all.
Are you sure this isn't Pluto-workflow specific? I haven't encountered it in Jupyter, VSCode, or anything else for a long time. If anything reinstalls or rebuilds, it is because of a decision I made.
It's exactly the same in any context, including Pluto.
Nothing recompiles if you don't do any package operations.
But consider the following simplified scenario (unrelated to teaching/learning). Every week or so, you want to perform some kind of new analysis that needs some packages and some code to be written.
Naturally, you want to:
- be able to reproduce these analyses in the future after a few years,
- and modify one of them independently in the future without breaking others.
So, following totally sensible Julia recommendations, you create new environments for these projects (see the sketch after this list). Package sets are often similar, with some differences here and there. But still, all of these envs require a long precompilation:
- when you start a new analysis, because the latest package versions have changed since last week,
- when you update Julia, run them on a different machine, or the compiled cache gets cleaned up (it cannot store tens of GB for all the different versions forever).
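The sketch mentioned above: the per-analysis workflow in plain Pkg commands, with illustrative paths and packages:

import Pkg

Pkg.activate("analyses/week-01-dataset-x")   # fresh environment for this week's analysis
Pkg.add(["DataFrames", "CairoMakie"])        # resolves to whatever versions are newest today

# A week later, a similar analysis gets its own environment; the resolution is
# slightly different, so much of the dependency tree precompiles again.
Pkg.activate("analyses/week-02-dataset-y")
Pkg.add(["DataFrames", "CairoMakie"])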
This makes 1.9 slower in the (arguably common) scenario where lots of small projects/analyses only see a couple of executions. Like, I play with some dataset, make some plots; turns out the results aren't useful for now. In a year, it becomes relevant, so I load the same env + code, everything reproduces exactly as before, I change something and produce a few plots. The end. I paid two recompilations for two code executions (more exactly, two executions in different Julia sessions; those without Julia restarts don't count anyway).