First Pluto notebook launches are slower on Julia 1.9 beta 3

This kind of resource might not be given easily to teachers in some universities. I know that the university I’m in doesn’t have (or doesn’t want to give?) that budget to science programs, which are the ones that would benefit the most from this.

I think what I am missing here is whether the issue is with the design of Pluto itself. Does it create a temporary environment not shared between notebooks, for example? For those of us using Jupyter notebooks for courses, we ship a manifest file in the root of the repo and tell students to instantiate it, with the caveat that it will take a while the first time and then things are fast. And for things like Binder, that ] instantiate process happens automatically, so the built container starts fast.
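For reference, that workflow is just a couple of Pkg calls (a minimal sketch; the path is illustrative):

import Pkg
Pkg.activate(".")    # the cloned course repo, containing Project.toml/Manifest.toml
Pkg.instantiate()    # installs the exact pinned versions (slow the first time only)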

I can understand the expectations of users that TTFX should be fast… but we should set expectations that the time to install packages and compile system-dependent binaries may be slow? It is still faster than a lot of Conda/pip installations for example.

So instead of directly attacking TFITFPP with extra effort to speed up compilation (which is fine, as far as I am concerned) or by turning off new features, maybe Pluto needs a different way to manage its manifest and dependencies?

2 Likes

The rules for how we cache are complicated. Essentially we have an LRU cache which we probe for the cache file we expect. We reject cache files that weren’t built with the same Julia version or the same sysimage.

On the producer side we bifurcate the cache, and then on the consumer side we probe the cache files and load the first one (julia/loading.jl at 1a0b92cff5cf55c45e9863ab648e340476cb8b59 · JuliaLang/julia · GitHub) that is valid for our session.

So what makes relocatability hard? For example: compile-time Preferences.jl, embedded machine paths, and the mtime of source files. I think we can fix the latter two, but the first one is harder (though maybe not as needed); also see Make precompile files relocatable/servable · Issue #47943 · JuliaLang/julia · GitHub
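For the curious, the consumer-side probe can be approximated from the REPL using Base internals. These are undocumented and their signatures vary across Julia versions, so treat this as a sketch rather than an API:

# assumes DataFrames is a dependency of the active project
pkg = Base.identify_package("DataFrames")
entrypath = Base.locate_package(pkg)
for cachefile in Base.find_all_in_cache_path(pkg)
    # stale_cachefile returns `true` when the file is unusable for this
    # session (wrong Julia version, sysimage, dependencies, ...)
    if Base.stale_cachefile(entrypath, cachefile) !== true
        println("would load: ", cachefile)   # first valid hit wins
        break
    end
end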

8 Likes
  1. IMHO Pluto actually manages manifests/dependencies very nicely and the decisions made by its devs are sound. Going further, Pluto notebooks can (and should) be an integral part of the ecosystem (I would call them Packagelets).

  2. As for TFITFPP: measurement is a first step. The measured values need not be acted upon or factor so highly into ‘the utility function’. At one tail, a 24h install time is a no-go; at the fast tail, some ecosystems (WASM) have gone from around 1 min down to <10 ms of install time* through intense optimization and caching. So there is a ways to go. The audience and effect of various types of speed/latency improvements are often unpredictable.

* for some definitions of install time

1 Like

Pluto environments are fully specified environments, so in some sense they should be able to build and share a whole ji/dll image. While general packages cannot do what @davidanthoff suggests, a Pluto notebook certainly could just ship the already compiled code (though compiled for one machine; maybe BinaryBuilder etc. could help there). A BinaryBuilder service for Pluto notebooks would solve the issue very nicely.

5 Likes

packagelets

I think they are merely “projectlets”, as they fix a manifest file, and then we can also just call them “apps”… which IMHO they really are…

Otherwise I fully agree.

EDIT: but then (re Chris) they are indeed more than fixed “apps”, as they encourage experimentation by adding cells, code, etc. So they are not static, and binary-building the package set of a notebook may not be worth the effort. But then, how about thinking more generally: binary-building a Manifest.toml? How useful would that be?

Does this line mean that cache files are never shared across projects? If I have two projects, and both use the same version of DataFrames, will I get two distinct cache files? When I played around with this in the REPL, that didn’t seem to be the behavior I was seeing. Or is there something clever going on, where the cache file for project A gets reused by project B if all the other deps match, or something like that?

That is exciting to see! To me this sounds like it is not easy, but not impossible in principle?

So right now I’m a bit confused, because @ChrisRackauckas seems to say that it is just not feasible in principle? Which one is it?

Does this line mean that cache files are never shared across projects? If I have two projects, and both use the same version of DataFrames, will I get two distinct cache files?

No. It means that if we don’t find a cache entry matching what we are looking for, we mix in the current project to avoid stomping on the cache of a different project.
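So on a miss, the cache key effectively gains a project-dependent component. Purely as an illustration of the effect (this is not the actual Base implementation), one could imagine:

using SHA   # stdlib

# Hypothetical: derive a per-project slug so a new cache entry
# doesn't stomp another project's entry.
project = something(Base.active_project(), "")
slug = bytes2hex(sha1(project))[1:8]
cachename = "DataFrames_$(slug).ji"   # stored alongside the other variants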

From my understanding from talking with Keno and Jeff, the incremental compilation stuff isn’t “too feasible”. But there are ideas thrown around of precompiling sets of package environments that are commonly seen. For example, using A,B,C,D would put one’s image into a state that might be “very different” from the Base one. So what could be nice is to have a system image with using A,B,C,D, so that when adding E you get incremental compilation, but hopefully “not that much”.

Now what could happen is that there are a few big libraries with many dependencies that are commonly used. Thus, while in principle there is a combinatorial explosion in the number of possible images one could have (and operating systems), it could happen that there are some very common sets. If telemetry were set up to ping a server, it could then build images for, say, the top 100 used image sets, send the image when one of those is seen, and only build the image locally on a cache miss. That would need to update itself often, because the images would have to change as the versions of the dependencies change, but it could reset every day.

Anyways, that’s super pie in the sky. I’ve been at a hackathon all week with a few JC compiler folks and this has been the kind of pie-in-the-sky talk going around. Please, no one expect any of that to materialize. So for right now you see a bunch of small PRs around improving inference/compile times, but note that there are also some ideas being tossed around for how to feasibly ship a reasonable number of images.

4 Likes

As someone who runs three different classes using Pluto, both for lectures and for exercises, I have to thank @fonsp for bringing this issue up, and I hope the concerns are taken seriously.

A game changer for Pluto was the introduction of built-in package management. It means that I can distribute notebooks to students and they will just work, because the correct package versions are installed. From my point of view, going back to external environments for a class is not an option for solving the latency issue. Environments are an expert topic that students should not have to worry about when using Pluto/Julia just as a tool for the class.

As noted above, I really appreciate that Pluto always comes with a well-designed user interface that usually hides the complexity. Is there some Pluto issue where the class-based optimization you indicated can be discussed?

11 Likes

This thread is motivated by issues observed in Pluto, but the actual issue with long precompilation is much wider than that - not directly related to notebooks at all.

It’s often advised that users create separate environments for different projects/analyses/one-off work/etc. That’s great advice, and it becomes easier and easier in Julia: REPL tools like ]activate --temp and auto-install of used packages, notebook tools like Pluto, and more.

However, Julia 1.9 will make some of these use cases noticeably slower. The main reason is that precompilation is not a process that needs to be done only once per package. Instead, packages need to be recompiled whenever any dependency updates, and that happens quite often.

I crafted a simple script to benchmark and quantify this issue: env_benchmark.jl · GitHub. It creates an environment with the same packages, but uses snapshots of the ecosystem from different days. This allows measuring

  • time taken by the original (first-time) precompilation when a user creates such an env and performs an analysis once,
  • and time taken when he creates a very similar env with the same packages after a week or so, e.g. for another analysis.

The script loads Tables, StructArrays, CSV, FlexiGroups, reads a simple CSV table and groups it - feel free to modify the code linked above and try your favorite workload.
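The workload itself boils down to something like this (a sketch in the spirit of the linked gist; the gist’s exact code may differ):

using Tables, StructArrays, CSV, FlexiGroups

# tiny stand-in for real data
path = tempname() * ".csv"
write(path, "name,grp\na,1\nb,2\nc,1\n")

tbl = StructArray(Tables.columntable(CSV.File(path)))   # parse into a table of rows
g = group(r -> r.grp, tbl)                               # FlexiGroups: collect rows by key
@show length(g)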
Results:

  • Julia 1.8
    • initial precompile 34 sec
    • precompile of a similar env after 10/20 days: 14/30 sec
    • TTFX after precompile 14 sec (includes using; mostly taken by CSV.read)
  • Julia 1.9
    • initial precompile 90 sec
    • precompile of a similar env after 10/20 days: 44/84 sec
    • TTFX after precompile 3.5 sec

Imagine a user who performs similar analyses in Julia once every couple of weeks for different projects. He creates separate environments for reproducibility, as advised. In Julia 1.8, he experiences a TTFX of roughly 35 sec on the first run of each analysis, and 14 sec on consecutive runs without any changes in the env. On 1.9, these times are 65 sec first / 3.5 sec after.

If each of these analyses is run only a few times (or within the same julia session!), 1.9 is noticeably worse for this kind of user. A similar experience occurs if one doesn’t create new envs, but updates/adds new packages to the same env.

Even more, the compilation cache has to be evicted at some point, so as not to take up the whole hard drive. This means that running some analysis from a year ago would also give longer TTFX on 1.9.

Pluto notebooks are just a specific manifestation of this general issue.

24 Likes

Any idea what happens if one sets

Pkg.UPDATED_REGISTRY_THIS_SESSION[] = true

by default everywhere?
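For example, in a startup file. Note that this flag is a Pkg internal rather than public API, so it may change between versions; a sketch:

# ~/.julia/config/startup.jl
import Pkg
Pkg.UPDATED_REGISTRY_THIS_SESSION[] = true   # skip the automatic registry update on Pkg.add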

(one note: I think the problem is even worse when using a bloated main environment - any update of any package may trigger the recompilation of many more packages).

Maybe the package registration system could accumulate the updated versions of packages for some appropriate amount of time, and then release a set of them to the world all at once, so that triggering recompilations would happen less frequently, and more of them would get done when they do happen.

To cut down the average compile times we just need to be using the exact same packages more often, and to work out how to make that happen, both by default and by manually specifying when we want it.

Pluto notebooks need to inherit versions from the parent manifest, so unless you want it to be otherwise, the versions in your notebooks created in the same timeframe will be the same. There should also be a way of forcing multiple notebooks to resolve to the same dependencies.

There is discussion on Slack of just turning off update-on-add, which would do a lot. I think we also need tools to merge manifests: to make a manifest use the same dependencies as another one, to resolve possible shared manifests, etc.

Then, say for a series of workshops, we can take a bunch of notebooks, find a master manifest for them all, and move them all to it, so there is really only one round of compilation for everything.
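A rough sketch of that last step, assuming each notebook uses a plain per-directory environment (all paths here are hypothetical):

import Pkg

master = "workshop_env"   # directory holding the master Project.toml/Manifest.toml
for dir in ("notebook1", "notebook2", "notebook3")
    cp(joinpath(master, "Project.toml"), joinpath(dir, "Project.toml"); force=true)
    cp(joinpath(master, "Manifest.toml"), joinpath(dir, "Manifest.toml"); force=true)
    Pkg.activate(dir)
    Pkg.instantiate()   # all envs resolve identically, so packages compile only once
end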

7 Likes

That sounds like a good direction. Translated to Pluto, it seems that we need a way to

  1. either connect several notebooks into a group sharing the environment,
  2. or let one choose from a global “Environment store” where one can manage and assign environments to notebooks. The environments could be stored somewhere in .julia.

The second option seems to be more general. The first option is interesting since it might give some nice UI for notebook groups, where one could have all the notebooks grouped together in a side pane.
In both cases it would be important that the notebook is still self-contained and can also be shared individually.

1 Like

FWIW: a while back we solved this with a mini-package, based on a suggestion from @vchuravy in https://github.com/JuliaLang/IJulia.jl/issues/673#issuecomment-425306944, with GitHub - QuantEcon/InstantiateFromURL.jl. In the end, after Julia package versions became less fragile, we decided to just have a shared root manifest within the notebook directory and tell people to clone the whole thing. So while this package is no longer actively maintained, maybe some variation of Valentin’s original suggestion could be used to make Pluto notebooks relocatable and yet share a single manifest.

again, basically 🎁 Package management · fonsp/Pluto.jl Wiki · GitHub

you can utilize the “named environment” feature from Julia itself too
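For example (the environment name is arbitrary; named environments live in ~/.julia/environments/):

import Pkg
Pkg.activate("course2023"; shared=true)   # same as `pkg> activate @course2023`
Pkg.add(["CSV", "DataFrames"])

# later, from the shell:
#   julia --project=@course2023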

1 Like

I also think this moves in the right direction. For Pluto, IMHO it would be good to have a tool which makes a bunch of notebooks share the same environment. Technically it is of course already possible by activating a joint environment in a directory; however, this would disable PlutoPkg, which is a godsend, in particular when teaching beginners…

1 Like

I was curious about actual times, so I ran this command:

time julia +$VERSION -e 'push!(empty!(DEPOT_PATH), mktempdir()); import Pkg; @time Pkg.add(split("Pluto Plots PlutoUI DataFrames CSV GLM Statistics LinearAlgebra Distributions")); @time include("linearmodel_datascience.jl")'

I ran it twice: once with VERSION=1.8 to use 1.8.5, and once with VERSION=beta to use 1.9-beta3. These are the numbers I got:

  • 1.8:
    • 142s total
    • 103s install
    • 36s run code
  • 1.9:
    • 227s total
    • 192s install
    • 24s run code

I also started the actual notebook in Pluto and waited for it to install and run everything, which on 1.9 took 236s to install and 40s to run code; on 1.8, running the code didn’t work, complaining that Plots wasn’t installed, but installation took 149s. Still, the times in Pluto were similar to running non-interactively, with some additional overhead, so I think we can extrapolate.

Some observations:

  • There’s definitely a significant slowdown in the end-to-end process, by some 60% in my single data point. Installation got 85% slower, but loading packages and running code got 33% faster. If you’re not constantly upgrading your packages, that’s overall a big win, since you pay that 33% every time you restart and the 85% only rarely.
  • If you’re using command-line Julia and Pkg, then package installation is very clearly a separate software install step, which we’ve consistently found is a place where people are psychologically willing to wait more so that they can wait less later when they actually want to use the software.
  • Pluto has a very different experience that puts package installation back in the path of code evaluation, where we’ve found that people really hate it. It also reduces all the visual feedback that Pkg gives on what it’s doing to some grey spinning wheels that seem to spin forever.
  • It seems worth considering how to better separate in Pluto the software installation stage from the software usage stage. Maybe take using Foo statements out of the notebook entirely and have a GUI widget on the right where the user manages what packages they want, which can give better feedback on download/install/compile and maybe give some ability to manage what versions are used.
39 Likes

I also feel that Pluto can alleviate some of the pain by changing the UI. Honestly, and this sounds sketchy but really it isn’t: just make the notebook display slower. Start execution as soon as it is launched, but include some fancy graphics for the front end. Maybe make it silly, like a typewriter filling in the cells or something. Then people say “wow, fancy update to Pluto” instead.

[default but make it optional of course]

2 Likes