Last year I did basically this, and shipped the class notebooks with system images that were specified in an Artifacts file. This worked quite well (for all the students that followed the instructions and downloaded the correct version of Julia): the students didn't need to know anything about environments and everything was pretty fast from the start.
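For context, a minimal sketch of how such a setup could be wired together, assuming a hypothetical artifact name and file layout (not the actual course code):

using Artifacts  # stdlib; provides the artifact"..." string macro

# Resolve a sysimage shipped as a (possibly lazy, platform-specific) artifact
# declared in the package's Artifacts.toml, then start Pluto with it.
sysimage_dir = artifact"course_sysimage"            # hypothetical artifact name
sysimage = joinpath(sysimage_dir, "sysimage.so")    # hypothetical file name inside it

# Relaunch Julia with the prebuilt sysimage and run the notebook server:
run(`julia --sysimage=$sysimage -e 'using Pluto; Pluto.run()'`)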
That's not an excuse not to document; I mean, that's even more reason to document, no?
But also, Pluto is not used as a library, so the things to document are mainly how-tos and tips; look at Configuration · Revise.jl.
Oh, that is not meant as an excuse, but as an explanation, if there are not enough people contributing to such documentation. After all, such a package is also a community effort.
IMHO, Pluto is known to be the opposite of that (I mean it neutrally). People like Pluto because its authors have a clear vision of its design and don't want Pluto to be anything other than that. This is evident, for example:
- Pluto has repeatedly urged people NOT to use GitHub issues for feedback: how is the community supposed to help discuss the direction if the feedback is only available to core devs? e.g. Delete button is too deep · Issue #1310 · fonsp/Pluto.jl · GitHub
Anyway, this is off-topic, but my point is: the first step to making something a community effort is to have docs, at least a devs' doc. We shouldn't expect people to find out how the package works by trial and error AND write docs for a package that's not even theirs and in which they don't have a say in most decisions.
I think there is a difference between the main design and the 'small steps' like contributing to the documentation (maybe just the docstrings would already improve things here). So my opinion here seems to be different from yours, which is fine of course. I think docstrings and the usage of parts of a package can also be documented by the community. And sure, we got a little off-topic here; sorry to the rest for that.
Should (does?) Pluto.jl have a way to save an optional Manifest file (I'm not sure if it's always saved with the Project, or neither)? I mean, to you it seems like overkill to have the file, and to others it seems like a very good feature to have it by default. I see no downside to including it anyway as an optional feature, and I do see value (your view) in having all the latest versions of packages (e.g. for speed), or at least guaranteed to be the same or later than in the Manifest file. Then if things seem off you could click on some 'reproducibility' button to enable the Manifest file (or some users could have it set by default for themselves).
Last year I did basically this, and shipped the class notebooks with system images that were specified in an Artifacts file
Then it's for e.g. x86_64, and not e.g. ARM (or WebAssembly; that, together with Shiny for Python / Shinylive, is what we will compete with now, and it's fast). Or do they/Artifacts support "fat binaries"/multi-arch for sysimages? Or could you then have different sysimages for different archs to save on download, and if none is supported for your platform, does Pluto default to no sysimage?
Should (does?) Pluto.jl have a way to save an optional Manifest file (I'm not sure if it's always saved with the Project, or neither)?
As of now, if you use packages and don't disable the pkg manager, a manifest is embedded in the notebook together with the project, and it's always used (it's not optional).
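For reference, this is roughly what that embedded environment looks like at the bottom of the notebook file when the built-in package manager is active (abbreviated, with illustrative contents; the hidden cell IDs and exact entries vary):

PLUTO_PROJECT_TOML_CONTENTS = """
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
"""

PLUTO_MANIFEST_TOML_CONTENTS = """
# This file is machine-generated - editing it directly is not advised
[[deps.Example]]
git-tree-sha1 = "..."
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.5.3"
"""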
Or could you then have different sysimages for different archs to save on download, and if none is supported for your platform, does Pluto default to no sysimage?
Yes, I created sysimages for the three most common architectures that students use. For an unknown architecture (less than 5% of students), it doesn't download a sysimage. These students can use the Pluto notebooks of the course without a sysimage, or create one for their architecture by running MLCourse.create_sysimage().
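For anyone curious, such a sysimage can be built with PackageCompiler.jl; a minimal sketch, assuming an illustrative package list and file names rather than the actual course setup:

using PackageCompiler

create_sysimage(
    ["Pluto", "Plots", "DataFrames"];                       # packages to bake into the image
    sysimage_path = "mlcourse_sysimage.so",                 # output file (illustrative name)
    precompile_execution_file = "precompile_workload.jl",   # script exercising typical notebook code
)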
I think that a scenario where you share something with a one-time user who prefers to avoid full precompilation, because only a simple script needs to be run, should also be on our radar.
An idea I had in this PR comment is that maybe we could have an ENV entry that would allow packages to get a signal about whether precompilation is desired or not. Would something like this make sense?
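To make that concrete, a hedged sketch of what a package author could do if such a switch existed; the variable name JULIA_PKG_PRECOMPILE_WORKLOAD is made up for illustration, not an existing ENV entry:

using SnoopPrecompile

# Inside a package's top-level code: only run the extra precompile workload
# when the (hypothetical) opt-out variable is not set to "false".
if get(ENV, "JULIA_PKG_PRECOMPILE_WORKLOAD", "true") != "false"
    @precompile_all_calls begin
        sum(rand(100))   # stand-in for a representative workload of the package
    end
end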
I think that a scenario where you share something with a one-time user who prefers to avoid full precompilation, because only a simple script needs to be run, should also be on our radar.
In the long run, the way to handle this will probably be to run things in an interpreter while code compiles in the background. That way you get the benefit of low latency from the interpreter, but still get the benefit of compilation for longer-running tasks.
But I think this is straying from the topic of Pluto notebooks. Personally I would like the option to run all Pluto notebooks within a given environment, rather than having an environment per notebook, so that I can instantiate a single environment for a project with several notebooks, and have the option to run an existing notebook within a specified environment rather than the one cached in the notebook.
Since the Pluto issue is just an instance of the more general issue of working with separate environments combined with high release frequencies of packages (both good things), I wonder what others think about this proposal to add already installed packages when possible:
Could Pkg.add perhaps add the latest compatible locally installed version? I want add to add things a.s.a.p., and update can take time to update. I use separate environments for different scripts, and often use packages with large dependency trees like ModelingToolkit and Makie. When adding these packages to a new environment because I want to quickly try something, I almost always have to wait a long time since usually some of their deps have updated since my last install and it needs to preco…
But I think this is straying from the topic of Pluto notebooks.
The issue is not specific to Pluto in any way; it's just a single instance. Everyone who uses environments for many projects/analyses/scripts is affected, because even very similar environments created on different dates differ a lot, and require recompilation; see my quantification of this above.
Personally I would like the option to run all Pluto notebooks within a given environment
But… That's already supported? 🙂 Package management · fonsp/Pluto.jl Wiki · GitHub
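Concretely, per that wiki page, a notebook that activates an environment explicitly makes Pluto switch off its built-in package manager; a first cell along these lines should do it (the path is illustrative):

begin
    import Pkg
    Pkg.activate(joinpath(@__DIR__, ".."))   # shared project environment for all notebooks
    Pkg.instantiate()                        # install exactly what its Manifest.toml pins
end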
rather than the one cached in the notebook.
It's not a "cache", it's just a regular environment. Different projects/analyses need different envs anyway so that they can be reproduced later. Otherwise, with a shared env, updating/adding a package for AnalysisA can silently break AnalysisB.
Could Pkg.add perhaps add the latest compatible locally installed version?
That sounds like Pkg.offline.
Great, thanks, that is indeed a big part of it. So more or less what I'm proposing is to default to an offline add; if that fails, do an online add. Effectively this:
julia> Pkg.offline(true)

(jl_89Sowl) [offline] pkg> add Example
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Example [7876af07]:
 Example [7876af07] log:
 ├─Example [7876af07] has no known versions!
 └─restricted to versions * by an explicit requirement — no versions left

julia> Pkg.offline(false)

(jl_89Sowl) pkg> add Example
   Resolving package versions...
   Installed Example ─ v0.5.3
    Updating `C:\Users\visser_mn\AppData\Local\Temp\jl_89Sowl\Project.toml`
  [7876af07] + Example v0.5.3
    Updating `C:\Users\visser_mn\AppData\Local\Temp\jl_89Sowl\Manifest.toml`
  [7876af07] + Example v0.5.3
Precompiling environment...
  1 dependency successfully precompiled in 1 seconds. 33 already precompiled.
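In other words, something like the following hedged sketch (add_offline_first is a hypothetical helper, not an existing Pkg API):

import Pkg

function add_offline_first(pkg::AbstractString)
    Pkg.offline(true)
    try
        Pkg.add(pkg)         # resolve against locally installed versions only
    catch
        Pkg.offline(false)   # nothing compatible locally; retry against the registry
        Pkg.add(pkg)
    finally
        Pkg.offline(false)
    end
end

add_offline_first("Example")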
The issue is not specific to Pluto in any way; it's just a single instance.
The Project.toml and Manifest.toml stuff works exactly as many like. I tell students to instantiate the manifest once to reproduce an environment, with the expectation set that the instantiation process might be slow, as with any other package system like conda etc., or even a little longer because it needs to do more high-performance compilation; but it is a one-time thing.
But that is a fixed cost, and afterwards it is blindingly fast these days. Students are told not to add packages to the shared notebook environment haphazardly, but they rarely would anyway. No package operations ever occur unless manually triggered.
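That one-time step is just the standard pattern, run from the folder containing the shared Project.toml and Manifest.toml:

import Pkg
Pkg.activate(".")     # the shared course/project environment
Pkg.instantiate()     # install the exact pinned versions and precompile them once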
Are you sure this isn't Pluto-workflow specific? I haven't encountered it in Jupyter, VSCode, or anything else for a long time. If anything reinstalls or rebuilds, it is because of a decision I made.
Since the Pluto issue is just an instance of the more general issue of working with separate environments combined with high release frequencies of packages (both good things),
A key feature of Project/Manifest setup is to avoid that sort of thing. The speed of package evolution should be irrelevant since you want a reproducible snapshot. If you aren't using shared manifests for each project (or set of lecture notes) then I can understand, but that is missing out on an amazing feature. Or maybe the feature already exists in Pluto but for whatever reason people aren't using it in that way?
That sounds like Pkg.offline.
Perhaps the quick add via using CSV (and answering "y") should do this, if possible? (And Pluto could follow that.) While ] add CSV is more deliberate, and could behave as now.
The recommended approach is in PSA for SnoopPrecompile: turning off extra workload for specific packages, and ENV is discouraged.
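For completeness, that PSA describes a Preferences.jl-based switch; if I remember it correctly it looks roughly like this (the preference key is from memory, so check the PSA for the exact name):

using SnoopPrecompile, Preferences

# Skip the extra precompile workload for specific packages (key name from memory):
set_preferences!(SnoopPrecompile, "skip_precompile" => ["PackageA", "PackageB"]; force=true)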
Personally I would like the option to run all Pluto notebooks within a given environment, rather than having an environment per notebook
That's possible via Pkg.activate(@__DIR__).
If you also want to run all notebooks in the same process then, although not recommended, you can set use_distributed=false.
Students are told not to add packages to the shared notebook environment haphazardly, but they rarely would anyway. No package operations ever occur unless manually triggered.
Sure, if you have a single environment that you never modify, no recompilation happens. That doesn't depend on the REPL/VSCode/Pluto at all.
Are you sure this isn't Pluto-workflow specific? I haven't encountered it in Jupyter, VSCode, or anything else for a long time. If anything reinstalls or rebuilds, it is because of a decision I made.
It's exactly the same in any context, including Pluto.
Nothing recompiles if you don't do any package operations.
But consider the following simplified scenario (unrelated to teaching/learning). Every week or so, you want to perform some kind of new analysis that needs some packages and some code to be written.
Naturally, you want to:
- be able to reproduce these analyses in the future after a few years,
- and modify one of them independently in the future without breaking others.
So, following totally sensible Julia recommendations, you create new environments for these projects (see the sketch after this list). Package sets are often similar, with some differences here and there. But still, all of these envs require a long precompilation:
- when you start a new analysis, because the latest package versions have changed since last week,
- when you update Julia, run them on a different machine, or the compiled cache gets cleaned up (it cannot store tens of GB for all the different versions forever).
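The sketch mentioned above: the per-analysis workflow in plain Pkg commands, with illustrative paths and packages:

import Pkg

Pkg.activate("analyses/week-01-dataset-x")   # fresh environment for this week's analysis
Pkg.add(["DataFrames", "CairoMakie"])        # resolves to whatever versions are newest today

# A week later, a similar analysis gets its own environment; the resolution is
# slightly different, so much of the dependency tree precompiles again.
Pkg.activate("analyses/week-02-dataset-y")
Pkg.add(["DataFrames", "CairoMakie"])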
This makes 1.9 slower in the (arguably common) scenario where lots of small projects/analyses only see a couple of executions. Like, I play with some dataset, make some plots; turns out the results aren't useful for now. In a year, it becomes relevant, so I load the same env + code, everything reproduces exactly as before, I change something and produce a few plots. The end. I paid two recompilations for two code executions (more exactly, two executions in different Julia sessions; those without Julia restarts don't count anyway).