There’s no second time if there’s no first time to start with.
Pluto’s approach to package management was especially bad about this: every new notebook almost invariably ends up hitting precompilation for its project.
My recommendation, regardless of whether pkgimages are used, would be a toggle that defaults to disabling precompilation for Pluto notebook environments managed by Pluto’s automatic package management system.
The assumption there being that since these are essentially scripts, precompilation is a waste of time: it only really benefits re-running the notebook, and imposes a hefty penalty the first time the notebook is run.
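For illustration, the toggle itself is hypothetical, but Pkg already has an opt-out that could implement it, the JULIA_PKG_PRECOMPILE_AUTO environment variable; a sketch:

```julia
# Sketch of the proposed default, using Pkg's existing opt-out.
# JULIA_PKG_PRECOMPILE_AUTO=0 skips the automatic precompile pass after
# Pkg operations, leaving all compilation to the JIT on first run.
ENV["JULIA_PKG_PRECOMPILE_AUTO"] = "0"
using Pkg
Pkg.activate(mktempdir())   # throwaway environment, like a Pluto notebook's
Pkg.add("Plots")            # installs without the full precompile pass
```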
I believe that precompilation in Julia 1.8 makes startups faster, not slower, because precompilation is parallelized.
Depends. You’re still usually going to be precompiling a lot more method signatures than would ever need to be JIT-compiled, unless the notebook is quite big and intensive.
E.g. if you’re precompiling all of Plots.jl just for a little line plot with a slider, it doesn’t matter that it’s parallel; you’re still wasting a lot of time. Especially since, as you say, the use case you care about is students, who tend to have weaker laptop CPUs (i.e., fewer cores).
Although, I haven’t tested this extensively so if you have evidence saying otherwise for small notebooks on weak CPUs, ignore me.
That’s a good suggestion for new notebooks! It does not affect launch times for new users though. We already have some tricks to avoid fetching the registry as often.
Ah, that’s good to know! I just mentioned registry updates because I had issues on really slow internet connections before, but for most users I’d suspect this not to be that big of an issue anymore.
I think mainly it would be great if there was a way to easily create a startup package as described by @tim.holy in Question about using SnoopPrecompile - #9 by tim.holy for use by new Pluto packages, but which would also fix all recursive dependencies to the precompiled versions until a user chooses to manually update them.
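A minimal sketch of the version-fixing half, assuming a dedicated course environment (the names here are made up); the precompile-workload half would follow the recipe in that thread:

```julia
# Sketch: resolve the course environment once, then pin the direct
# dependencies so later Pkg operations cannot silently upgrade (and so
# invalidate) the precompiled versions until the user unpins them.
using Pkg
Pkg.activate("CourseStartup")     # hypothetical course environment
Pkg.add(["Plots", "PlutoUI"])     # whatever the course needs
Pkg.pin(collect(keys(Pkg.project().dependencies)))
```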
That’s a cool trick! This sounds less useful to first-time Julia users though, right? By first I really mean “first day, let’s try out Julia!”
I think the use case I had in mind, and previously talked with @alanedelman about, was a classroom setting, where it wouldn’t be a big issue to tell students at the start of the semester to run these commands and just wait a bit, if that meant using Pluto was always super snappy from there on out.
How about a package which creates a template notebook environment that is then copied when someone creates a new notebook?
This template notebook environment could be precompiled during the build phase of the PlutoBasicScienceTemplate.jl package and then reused. Thus students could be instructed to just execute the following commands:

```julia
using Pkg
pkg"add Pluto PlutoBasicScienceTemplate"
```
The default template to use could be configured via Preferences.jl or selected as an argument to Pluto.run(). I find that users may tolerate a one-time install that runs long, as long as they do not incur this cost repeatedly.
Combined with the Pkg.offline option above, this shifts precompilation to a one-time upfront cost rather than a per-notebook cost.
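For reference, that option is just:

```julia
using Pkg
Pkg.offline(true)   # resolve only against versions already on disk
Pkg.add("Plots")    # reuses the installed (and already precompiled) versions
```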
At some point, Julia could have a registry of environments for which cross-precompiled bits are provided, in a Yggdrasil-style fashion. This should solve the problem, provided one has a good download speed and the Pluto notebooks outsource the embedded definition of the environment to the remote one.
This is just (uneducated) speculation, I don’t know if there are any technical blockers.
I would prefer keeping --pkgimages=yes as the default for general use. I think people are more forgiving of delays at package installation than in everyday use. For example, it’s not uncommon for R to take ages when installing a new package, but it feels snappy afterwards.
A better change I think would be having something like Pkg.offline enabled by default, but only for Pkg.add operations: Pkg.update operations would work as usual, looking for the latest version of everything.
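Spelled out with today’s API, the proposed behavior would amount to:

```julia
using Pkg
Pkg.offline(true)       # adds reuse already-installed, precompiled versions
Pkg.add("DataFrames")
Pkg.offline(false)      # updates remain fully online
Pkg.update()            # fetches the latest version of everything
```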
Yes, please, have an option to use pkgimages, some users may be willing to pay a larger cost upfront to have a faster notebook afterwards. This can be for example useful in demos where only the presenter runs the code, not the audience.
Side note: the fact that Pluto notebooks use a temporary environment directory (I think?) means that packages installed by Pluto notebooks are more likely to be garbage-collected, as their environments disappear immediately.
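For what it’s worth, that cleanup is what Pkg.gc() eventually performs once the orphaned manifests age out of Pkg’s usage log:

```julia
using Pkg
Pkg.gc()   # delete package versions no longer reachable from any known manifest
```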
I feel similar to this (not sure about the default though). If we tell people to use environments as much as possible, and precompiling these environments takes quite some time, we should also make it super easy to ] add a new package while reusing as many of the existing (i.e. precompiled) packages as possible.
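Pkg already has a knob pointing in that direction; a sketch of an add that refuses to move anything already installed:

```julia
using Pkg
# Add a new package while keeping every existing dependency at its current
# (already precompiled) version; resolution fails rather than upgrading.
Pkg.add("CSV"; preserve=Pkg.PRESERVE_ALL)
```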
For me personally, Pluto’s startup times have always felt too long but mostly because everything needs to run once before anything can be changed. And that always took a long time with Makie and AlgebraOfGraphics plots. If that problem gets even worse with pkgimages, that’s not good.
To get competitive with Javascript or Python solutions in terms of startup feel, we can only go in two directions, interpreting so that code starts running immediately, or building binaries beforehand. We now have binaries being built beforehand with pkgimages, but as the building happens on the users’ computers, there is no latency benefit for each very first startup.
I think we can never have a central repository of pkgimages because an image of a package is only valid for a fixed set of dependencies and their recursive dependencies, so nobody can store all those combinations. Therefore, it would make sense if the teacher could prepare the pkgimages binaries themselves, and offer them to load from a university server at the start of a course.
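One existing mechanism that gets partway there is depot stacking; a sketch of the student side (the server path is made up, and the caches only load if package versions, Julia build, and CPU target all match):

```julia
# startup.jl sketch: put a shared, read-only depot (e.g. synced from a
# course server) behind the student's own writable depot. Julia searches
# the whole depot stack for compile caches / pkgimages before rebuilding.
push!(DEPOT_PATH, "/mnt/course/shared_depot")   # hypothetical mount point
```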
Serving teacher-built pkgimages this way would lead to two issues. One is that each platform would need its own pkgimages, so ideally a teacher would be able to generate such images on any of the supported platforms for all the others, kind of like cross-compilation in BinaryBuilder. I have no idea if this is technically feasible.
The other is that if there are 10 notebooks with, let’s say, Makie in each of them, but some dependencies differ slightly, Makie would need multiple pkgimages, increasing the amount needed to download. This could be tackled by creating one big environment that contains the dependencies for all scripts at once (unless that is not possible due to conflicts) and using that instead of per-notebook dependencies. Maybe there could be a tool for taking a set of Project.tomls, solving them all together, and writing the subset of packages needed by each individual notebook out to that notebook’s respective Manifest.
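A sketch of the first half of such a tool (the notebook paths are made up); writing the per-notebook Manifests back out would follow the same pattern in reverse:

```julia
# Sketch: merge the [deps] sections of several notebook Project.tomls and
# resolve them together in one shared environment, so all notebooks agree
# on (and can share pkgimages for) the same package versions.
using TOML, Pkg

projects = ["nb1/Project.toml", "nb2/Project.toml"]   # assumed notebook paths
merged = Dict{String,String}()
for p in projects
    merge!(merged, get(TOML.parsefile(p), "deps", Dict{String,String}()))
end

Pkg.activate("shared_env")
Pkg.add([PackageSpec(name = n, uuid = u) for (n, u) in merged])
```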
I don’t agree. At any one point in time there will be only one set of (about 100) packages that is needed for Plots.jl for example. Each time a new version of Plots.jl is released this could be compiled on a central server. If people then add other packages it is only required that no packages are updated that are already installed. The same could be done with other large packages like DifferentialEquations. This could already save 50% to 80% of the local compilation time in many use cases.
Is the title supposed to just be inflammatory and incorrect? No, “Julia 1.9beta3 is much slower for first-time users”, just no.
Yes, there is a bit more precompilation, but that parallelizes fairly well, to the point where it generally only becomes a problem for a few longer-precompiling packages. But due to parallel precompilation and the fact that this is clearly a setup phase separated from the usage (and still much faster than the installation you get from R and Python, BTW; CRAN takes for freaking ever to install packages), it’s fine. And you get a great improvement in TTFX. So empirically, no, it just isn’t the case.
Did you mean it’s much slower for first time users only in the Pluto case? Then why isn’t that in the title?
I use a U-series laptop for mobile work (I bought it to force myself to start working on TTFX, and as you can see, that plan worked), which is basically a chip that’s powered like a cellphone, and I can say the experience is much better now on these. It’s also better for usage on cellphone/tablet chips, which of course is much more niche. Just a year ago TTFX on these devices was around 2-3 minutes; now it’s down to 0.5 seconds, and the precompilation time has barely 2x’d. Of course, the main thing with these types of devices is that single-core performance is drastically underpowered while there are many cores, so moving more work into parallel precompilation is a huge win.
I think rather than push on rhetorical style, the Julia community could improve some of these concerns by having a top-level dashboard it tracks daily of a few core metrics of latency that will be the main goals for 1-2 years of development. Making constant measurable progress on a set of metrics everyone agrees matters will work more effectively than verbal debates.
But it does not have to be a new version of Plots.jl. If one of those 100 dependencies releases a minor new version fixing some spelling error in a doc string, Plots.jl needs to be recompiled. For a large set of dependencies those changes occur fairly often. And compat entries normally do not fix the version, they only establish bounds. Throw in some more packages into the same environment and the number of possible package version combinations explodes.
I think deferring environment updates as @carstenbauer suggested would really help with the problem, especially adding new dependencies without an implicit update of everything. It might be enough to just tell the user that there are updates available and that they can get them using ] up.
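Today’s closest equivalent of that notification would be:

```julia
using Pkg
Pkg.status(outdated=true)   # list installed packages with newer versions available
# and only when the user explicitly opts in:
Pkg.update()
```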
I also agree this thread seems a bit surprising. For first time users, we could print a special message, “Hold tight while we precompile your packages to make everything go fast!” and then what you’re calling a loss turns into a win, because if they know why it’s happening most users will be patient while the process completes.
You’re complaining about a one-time 2x precompilation degradation when packages like Makie have had an every-time nearly 100x TTFX win. While I sympathize about the concerns about the 2x precompile degradation, I agree with Chris that the title of this post is just wrong, and that’s been backed up beyond a shadow of a doubt by (non-Pluto) “new user” reactions I’ve seen in my teaching.
Indeed, I’d go so far as to say that Julia 1.9 may be the first Julia version that’s truly suitable for new programmers. Julia is great for advanced classes, but for an “intro to programming” class, too many things in Julia are slow without it delivering the advantages that Julia brings to more advanced programmers. With Julia 1.9, I’m moving an introductory programming class from Python to Julia precisely because you can plot in a reasonable time now.