First Pluto notebook launches are slower on Julia 1.9 beta 3

Hi! :wave: Today, I ran some benchmarks to compare real-life launch times for first-time users of Julia 1.9.0-beta3 vs Julia 1.8.5. That means people who have just installed Julia on their computer and are excited to get started!

Results

Summary of three samples, details below.

I found that starting Pluto the first time is 2.3x slower in Julia 1.9 (57.8 sec vs 25.4 sec). This is common in Julia 1.9: the increase in precompilation time is greater than the reduction in TTFX.

Once Pluto is running, running this basic data science notebook (Plots, CSV) is 1.6x slower (3 min 16 sec vs 2 min 2 sec).

This was on a very fast computer; these times might easily be double on an old laptop. More details about my benchmark here.
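For reference, the two slowdown factors follow directly from the timings quoted above. A small sketch of the arithmetic (the variable and function names are just for illustration):

```julia
# Timings quoted in this post, converted to seconds.
pluto_launch_19 = 57.8          # first Pluto launch, Julia 1.9.0-beta3
pluto_launch_18 = 25.4          # first Pluto launch, Julia 1.8.5
notebook_19 = 3 * 60 + 16       # 3 min 16 sec
notebook_18 = 2 * 60 + 2        # 2 min 2 sec

# Ratio of the new time to the old time, rounded to one decimal.
slowdown(new, old) = round(new / old; digits = 1)

println("Pluto launch: ", slowdown(pluto_launch_19, pluto_launch_18), "x slower")
println("Notebook run: ", slowdown(notebook_19, notebook_18), "x slower")
```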

Bad first impression

In my experience (talking with university teachers and Pluto users), slow launch times are a big pain point of using Pluto and Julia. Teachers report that they worry about leaving a bad first impression.

It is easy to forget that launch times are experienced as disproportionately large by students because:

  • They have an empty .julia cache.

  • They use lower-power computers than their teachers.

  • They run many different notebooks, each with a new Manifest to be loaded (which might not have much overlap with previous notebooks). They spend more time loading, less time tweaking.

  • They are used to faster digital experiences (spreadsheets, email). Normally, seeing a loading bar stuck for 1 minute means that something is broken.

  • They might be used to faster programming environments (JavaScript, Python). For example, this notebook, written in JavaScript, also loads a CSV dataset, runs some basic statistics and displays plots. With an empty cache, it downloads and runs in your browser in 3 seconds. That’s 100x faster!

Outside of education, I only have limited experience with first-time Julia users. But here, my worry is that during those extra minutes of loading bars, we are scaring away new users, without a teacher to tell them that this is expected (and better times are ahead).

My conclusions

On one hand, as an experienced Julia user, I really appreciate the improved TTFX when working in a constant package environment. And I am super impressed by the technical achievement of the feature!

But as someone trying to make Julia look fast and interactive on the web, and to accommodate first-time Julia users, I wanted to write this post to highlight this perspective. In its current state, Julia 1.9 is much slower for first-time users. Are we making a trade-off at the expense of new users?

The way I see it: On their first day, students would enjoy Julia 1.8 more than Julia 1.9, and they would be more likely to pick it up again and to recommend Julia to their friends.

Solutions

The feature also came with a new command-line flag, julia --pkgimages=(yes|no). This empowers us to decide when to use or not use the feature, and we could potentially get the best of both worlds! :tada:
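As a rough illustration of how a tool could pass this flag when spawning a worker process, here is a minimal sketch. The helper name `julia_command` is made up for this example and is not Pluto's actual code; `Base.julia_cmd()` and the `--pkgimages` flag (Julia 1.9 and newer) are the only real pieces it relies on.

```julia
# Hypothetical helper: build (but do not run) a command that starts a new
# Julia process with package images switched on or off.
function julia_command(; pkgimages::Bool)
    flag = "--pkgimages=" * (pkgimages ? "yes" : "no")
    # Base.julia_cmd() returns the command used to start the current Julia.
    return `$(Base.julia_cmd()) $flag`
end

cmd = julia_command(pkgimages = false)
println(cmd)

# On Julia 1.9 or newer, a process could then be started with, for example:
# run(`$cmd -e 'println("hello from a process without pkgimages")'`)
```

The sketch only constructs the command, so it can be inspected without launching anything.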

For Pluto.jl, we are currently discussing setting this to no by default for notebook processes and precompilation. Or we might find a dynamic approach, e.g. only using pkgimages when you run a notebook a second time, or using a new GUI to let users decide.
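The dynamic approach could look something like the following sketch. Everything in it is hypothetical: the function name, the run counter and the `user_choice` override are assumptions for illustration, not an existing Pluto setting.

```julia
# Hypothetical policy: skip package images on the first run of a notebook
# (shorter first launch), and enable them once the notebook has run before.
function pkgimages_flag(previous_runs::Integer; user_choice::Union{Bool,Nothing} = nothing)
    # An explicit choice (for example from a GUI toggle) always wins.
    use_images = user_choice === nothing ? previous_runs >= 1 : user_choice
    return "--pkgimages=" * (use_images ? "yes" : "no")
end

println(pkgimages_flag(0))                      # first run
println(pkgimages_flag(1))                      # second run
println(pkgimages_flag(0; user_choice = true))  # user opted in explicitly
```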

Outside of Pluto.jl, what about setting this flag to no by default in Julia? This way, new users have a better first impression, and more experienced Julia users can use julia --pkgimages=yes to benefit from the feature. What do others think about this?

46 Likes

As a heavy Pluto user, I fully support your concerns about TTFX in the notebook context, and looking forward to Julia/Pluto improvements in that regard!
If setting pkgimages=no would bring total TTFX times back to 1.8 levels, this may be a simple first step, while looking for even better solutions.

Generally, it’s great that the “each source file (notebook) is a separate isolated environment” scenario is so well-supported in Julia and Pkg! But it seems like this scenario wasn’t really considered in the 1.9 precompilation development, and correspondingly TTFX times suffer.

3 Likes

Another possible solution might involve something similar to Pkg.offline, so when creating a new notebook, Pluto by default doesn’t try to fetch the latest version of the registry and uses versions of packages that may have already been precompiled.

I think that could be a way of avoiding the large initial precompilation cost while still taking advantage of the TTFX improvements.
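A minimal sketch of what this might look like with Pkg's existing offline mode. `Pkg.offline` is a real Pkg function; the wrapper `with_offline` is a made-up helper, and it assumes offline mode was switched off before the call.

```julia
using Pkg

# Hypothetical helper (not part of Pluto or Pkg): run `f` with Pkg's offline
# mode enabled, then switch offline mode off again.
# Assumption: offline mode was off before this function was called.
function with_offline(f)
    Pkg.offline(true)
    try
        return f()
    finally
        Pkg.offline(false)
    end
end

# Example usage (commented out so that running this file installs nothing):
# with_offline() do
#     Pkg.add("CSV")   # resolves against package versions already on disk
# end
```

In offline mode, Pkg only considers package versions that are already installed, so an add is more likely to land on versions that have been precompiled before.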

8 Likes

There’s no second time if there’s no first time to start with.

2 Likes

Pluto’s approach to package management is especially bad about this: every new notebook will almost invariably end up hitting precompilation for the project.

My recommendation, regardless of whether pkgimages are used or not, would be to set a toggle which defaults to disabling pre-compilation of Pluto notebook environments which use Pluto’s automatic package management system.

The assumption there is that, since these are essentially scripts, precompilation is a waste of time: it only really benefits re-running the notebook, and it imposes a hefty penalty the first time the notebook is run.
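One existing knob in this direction is the `JULIA_PKG_PRECOMPILE_AUTO` environment variable, which Pkg reads to decide whether to precompile automatically after package operations. The sketch below wraps it in a made-up helper, `without_auto_precompile`; it illustrates the idea and is not a Pluto setting.

```julia
using Pkg

# Hypothetical helper: run `f` with Pkg's automatic precompilation disabled.
# JULIA_PKG_PRECOMPILE_AUTO is an environment variable read by Pkg itself.
function without_auto_precompile(f)
    withenv(f, "JULIA_PKG_PRECOMPILE_AUTO" => "0")
end

# Example usage (commented out so that running this file installs nothing):
# without_auto_precompile() do
#     Pkg.add("Plots")   # installs, but skips the eager precompile step
# end
```

Note that this only skips the eager precompilation of the whole environment. A package without a cache is still precompiled the first time it is loaded with `using`, so this alone does not remove the cost for packages the notebook actually imports.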

9 Likes

I believe that precompilation in Julia 1.8 makes startups faster, not slower, because precompilation is parallelized.

Depends. You’re still usually going to be precompiling a lot more method signatures than need to be regularly jit compiled unless the notebook is quite big and intensive.

E.g. if you’re precompiling all of Plots.jl just for a little line plot with a slider, then it doesn’t matter if it’s parallel, you’re still wasting a lot of time. Especially since, as you say, the use case you care about is students who tend to have weaker laptop CPUs (i.e., fewer cores).

Although, I haven’t tested this extensively so if you have evidence saying otherwise for small notebooks on weak CPUs, ignore me.

2 Likes

That’s a good suggestion for new notebooks! It does not affect launch times for new users though. We already have some tricks to avoid fetching the registry as often. :+1:

Ah, that’s good to know! I just mentioned registry updates because I had issues on really slow internet connections before, but for most users I’d suspect this not to be that big of an issue anymore.

I think mainly it would be great if there was a way to easily create a startup package as described by @tim.holy in Question about using SnoopPrecompile - #9 by tim.holy for use by new Pluto packages, but which would also fix all recursive dependencies to the precompiled versions until a user chooses to manually update them.

1 Like

That’s a cool trick! This sounds less useful to first-time Julia users though, right? By first I really mean “first day, let’s try out Julia!”

I think the use case I had in mind and previously talked with @alanedelman about was in a classroom setting, where it wouldn’t be a big issue to tell students at the start of the semester to run these commands and just wait a bit if that meant using Pluto was always super snappy from there on out.

4 Likes

How about a package which creates a template notebook environment that is then copied when someone creates a new notebook?

This template notebook environment could be precompiled during the build phase of the PlutoBasicScienceTemplate.jl package and then reused. Thus students could be instructed to just execute the following commands.

using Pkg
pkg"add Pluto PlutoBasicScienceTemplate"

The default template to use could be configured via Preferences.jl or selected as an argument to Pluto.run(). I find that users may tolerate a one-time install that runs long, as long as they do not incur this cost repeatedly.

Combined with the Pkg.offline option above, this shifts precompilation time to a one-time upfront cost rather than a per-notebook cost.
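The copy step could be as simple as the following sketch. The function name and directory layout are assumptions for illustration; PlutoBasicScienceTemplate.jl is a proposed package, so nothing here reflects an existing API.

```julia
# Hypothetical sketch: copy a prepared template environment into the
# directory of a new notebook. Only Base file functions are used.
function copy_template_env(template_dir::AbstractString, notebook_env_dir::AbstractString)
    mkpath(notebook_env_dir)
    for file in ("Project.toml", "Manifest.toml")
        src = joinpath(template_dir, file)
        # A template might ship without a Manifest; copy only what exists.
        isfile(src) && cp(src, joinpath(notebook_env_dir, file); force = true)
    end
    return notebook_env_dir
end

# Demo with throwaway directories and placeholder file contents.
template = mktempdir()
write(joinpath(template, "Project.toml"), "[deps]\n")
write(joinpath(template, "Manifest.toml"), "# placeholder manifest\n")

new_env = copy_template_env(template, joinpath(mktempdir(), "notebook_env"))
println(readdir(new_env))
```

Because the new notebook starts from the same Manifest as the template, it pins the same package versions, which is what would allow already-built precompile caches to be reused.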

4 Likes

At some point, Julia could have a Registry of environments for which cross-precompiled bits are provided, in a Yggdrasil-style fashion. This should solve the problem provided one has a good download speed and the Pluto notebooks outsource the embedded definition of the environment to the remote one.
This is just (uneducated) speculation, I don’t know if there are any technical blockers.

1 Like

I would prefer keeping --pkgimages=yes as default for general use. I think people are more forgiving of delays at package installation than in everyday use. For example it’s not uncommon for R to take ages when installing a new package, but it feels snappy afterwards.

A better change I think would be having something like Pkg.offline enabled by default, but only for Pkg.add operations: Pkg.update operations would work as usual, looking for the latest version of everything.

7 Likes

Yes, please, have an option to use pkgimages, some users may be willing to pay a larger cost upfront to have a faster notebook afterwards. This can be for example useful in demos where only the presenter runs the code, not the audience.

Side note, the fact that Pluto notebooks use a temporary environment directory (I think?) means that packages installed by Pluto notebooks are more likely to be garbage-collected as their environments disappear immediately.

3 Likes

I feel similarly about this (not sure about the default though). If we tell people to use environments as much as possible and precompiling these environments takes quite some time, we should also make it super easy to ] add a new package while reusing as many of the existing (i.e. precompiled) packages as possible.

6 Likes

For me personally, Pluto’s startup times have always felt too long but mostly because everything needs to run once before anything can be changed. And that always took a long time with Makie and AlgebraOfGraphics plots. If that problem gets even worse with pkgimages, that’s not good.

To get competitive with Javascript or Python solutions in terms of startup feel, we can only go in two directions, interpreting so that code starts running immediately, or building binaries beforehand. We now have binaries being built beforehand with pkgimages, but as the building happens on the users’ computers, there is no latency benefit for each very first startup.

I think we can never have a central repository of pkgimages because an image of a package is only valid for a fixed set of dependencies and their recursive dependencies, so nobody can store all those combinations. Therefore, it would make sense if the teacher could prepare the pkgimages binaries themselves, and offer them to load from a university server at the start of a course.

This would lead to two issues. One is that each platform would need its own pkgimages, so ideally a teacher would need to be able to generate such images on any of the supported platforms for all the others, kind of like cross compilation in BinaryBuilder. I have no idea if this is technically feasible.

The other is that if there are 10 notebooks and let’s say Makie in each of them, but some dependencies differ slightly, Makie would need multiple pkgimages, increasing the amount needed to download. This could be tackled by creating one big environment that contains the dependencies for all scripts at once (unless that is not possible due to conflicts) and using that instead of per-notebook dependencies. Maybe there could be a tool for taking a set of Project.toml files, solving them all together and writing the subset of packages needed for each individual script out to that script’s respective Manifest.
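The first half of such a tool, collecting the dependencies of several projects into one, could be sketched with the TOML standard library as below. The function name is made up, the package names and UUIDs are placeholders, and compat bounds and conflict handling are ignored entirely.

```julia
using TOML

# Hypothetical sketch: collect the [deps] tables of several Project.toml files
# into one combined table, so that a single environment can be resolved for
# all notebooks at once.
function merge_project_deps(project_files)
    merged = Dict{String,Any}()
    for file in project_files
        deps = get(TOML.parsefile(file), "deps", Dict{String,Any}())
        merge!(merged, deps)
    end
    return merged
end

# Demo with two throwaway projects. Package names and UUIDs are placeholders.
dir = mktempdir()
a = joinpath(dir, "A.toml")
b = joinpath(dir, "B.toml")
write(a, "[deps]\nPkgA = \"00000000-0000-0000-0000-000000000001\"\n")
write(b, "[deps]\nPkgA = \"00000000-0000-0000-0000-000000000001\"\nPkgB = \"00000000-0000-0000-0000-000000000002\"\n")

combined = Dict("deps" => merge_project_deps([a, b]))
TOML.print(combined)
```

Resolving the combined project and writing a per-notebook subset of the resulting Manifest would still have to be done with Pkg; that part is left out here.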

5 Likes

I don’t agree. At any one point in time there will be only one set of (about 100) packages that is needed for Plots.jl for example. Each time a new version of Plots.jl is released this could be compiled on a central server. If people then add other packages it is only required that no packages are updated that are already installed. The same could be done with other large packages like DifferentialEquations. This could already save 50% to 80% of the local compilation time in many use cases.

3 Likes

Is the title supposed to just be inflammatory and incorrect? No, “Julia 1.9beta3 is much slower for first-time users”, just no.

Yes, there is a bit more precompilation, but that parallelizes fairly well to the point where it generally only becomes a problem for a few longer precompiling packages. But due to parallel precompilation and the fact that this is clearly a setup phase separated from the usage (and still much faster than the installation you get from R and Python BTW, CRAN takes for freaking ever to install packages), it’s fine. And you get a great improvement in TTFX. So empirically no, it just isn’t the case.

Did you mean it’s much slower for first time users only in the Pluto case? Then why isn’t that in the title?

I use a U-series laptop for mobile work (I bought it to force myself to start working on TTFX, and as you can see, that plan worked), which is basically a chip that’s powered like a cellphone, and I can say the experience is much better now on these. It’s also better for usage on cellphone/tablet chips, which of course is much more niche. Just a year ago TTFX on these devices was around 2-3 minutes, now it’s down to 0.5 seconds. And the precompilation time has barely 2x’d. Of course, the main thing with these types of devices is that the single-core performance is drastically underpowered while they have many cores, so putting more work into parallel precompilation is a huge win.

16 Likes

Is the title supposed to just be inflammatory and incorrect? No, “Julia 1.9beta3 is much slower for first-time users”, just no.

I think rather than push on rhetorical style, the Julia community could address some of these concerns by having a top-level dashboard that tracks, daily, a few core latency metrics that will be the main goals for 1-2 years of development. Making constant, measurable progress on a set of metrics that everyone agrees matter will work more effectively than verbal debates.

39 Likes