First Pluto notebook launches are slower on Julia 1.9 beta 3

But it does not have to be a new version of Plots. If one of those 100 dependencies releases a minor new version fixing a spelling error in a docstring, Plots.jl needs to be recompiled. For a large set of dependencies, such changes occur fairly often. And compat entries normally do not pin the version; they only establish bounds. Throw a few more packages into the same environment and the number of possible package-version combinations explodes.
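
As a sketch of why this happens (the package name is real, the version bounds are illustrative), a typical [compat] entry in a Project.toml only bounds the allowed versions rather than pinning one:

```toml
[compat]
# Any 1.x release of Plots satisfies this bound, so the resolver may pick a
# newer release on the next update, triggering recompilation.
Plots = "1"

# An exact pin (rare in practice) would prevent that:
# Plots = "=1.38.0"
```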

I think deferring environment updates as @carstenbauer suggested would really help with the problem, especially adding new dependencies without an implicit update of everything else. It might be enough to just tell the user that there are updates available and that they can get them using ] up.
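
As a sketch of what “adding without implicit update” could look like, Pkg already exposes a preserve keyword for this (available since Julia 1.5; “Example” is a placeholder package name):

```julia
using Pkg

# Add a new package while keeping all currently installed versions fixed,
# instead of implicitly upgrading the rest of the environment.
Pkg.add("Example"; preserve=Pkg.PRESERVE_ALL)

# Upgrades then happen only when explicitly requested, e.g. via `] up`:
# Pkg.update()
```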

2 Likes

I think Make precompile files relocatable/servable · Issue #47943 · JuliaLang/julia · GitHub is one

I also agree this thread seems a bit surprising. For first time users, we could print a special message, “Hold tight while we precompile your packages to make everything go fast!” and then what you’re calling a loss turns into a win, because if they know why it’s happening most users will be patient while the process completes.

You’re complaining about a one-time 2x precompilation degradation when packages like Makie have had an every-time nearly 100x TTFX win. While I sympathize about the concerns about the 2x precompile degradation, I agree with Chris that the title of this post is just wrong, and that’s been backed up beyond a shadow of a doubt by (non-Pluto) “new user” reactions I’ve seen in my teaching.

Indeed, I’d go so far as to say that Julia 1.9 may be the first Julia version that’s truly suitable for new programmers. Julia is great for advanced classes, but for an “intro to programming” class, too many things in Julia are slow without it delivering the advantages that Julia brings to more advanced programmers. With Julia 1.9, I’m moving an introductory programming class from Python to Julia precisely because you can plot in a reasonable time now.

70 Likes

I understand we should listen to feedback etc. etc., but what I am saying is that

I am testing on a U-series Intel, and the from-scratch first install has gone from over 30 minutes (likely around 45?) to around 6-10 minutes. This is a CPU that is much slower than any normal/average i3/i5 and is only used in fanless ultrabooks. So “is much slower” for underpowered machines is a no, according to the measurements that were defined in the OP.

There’s FUD about (air quotes) “how bad is it on this machine that I cannot measure that some students might have to use?”, and I am saying I have the answer because I am testing it on an ultra underpowered machine (which optimizes battery life at the detriment of everything else) and the answer is that it’s empirically not slower. It’s much faster to do a from-scratch Julia installation to first solve! I understand listening to the feedback and everything, but that doesn’t change the actual measurements I am seeing on such a machine!

That said, there are a few things to note. I will note them here, separately from my post above (plus the Slack discussions etc. on the same topic), to better highlight them:

  1. It’s all improved because of parallel precompilation. I don’t use Pluto all that much so I don’t know, but my guess is that it disables parallel precompilation which would be the cause for this complaint. If that’s the case, then the real question should be, how do we get Pluto using parallel precompilation?
  2. Parallel precompilation requires higher energy usage which gets throttled on laptops when not plugged in. So the difference between having your laptop plugged in vs having it on battery is now huge. The plugged in performance is a lot better than v1.7 and such, but the battery performance seems to be about the same as before.
  3. When doing the parallel precompilation, it tends to sit on just a few packages for a majority of the time. OrdinaryDiffEq.jl is one of these packages, and we know this comes down to something RecursiveFactorization.jl-based but it’s not hit in the RecursiveFactorization precompilation which is confusing. So why? Likely it’s a union splitting issue and so @Elrod and I are at a hackathon this week and will be poking at this a bit.
  4. We are looking into the effect of adding more max_methods = 1 annotations on precompile times. Poking around and such. This has highlighted that the best thing would be tools for profiling precompile times; right now it’s poking in the dark. There’s a lot of guess-and-check, and I think having more concrete tools would make it easier to find improvements.
  5. We are making a lot more packages use package extensions. I’ve copy-pasted the link to make Distributions a weak dependency by oscardssmith · Pull Request #854 · SciML/DiffEqBase.jl · GitHub to a few people around here, basically saying “follow this template and remove dependencies”. We have about 20 packages still on the list. The main thing is that StaticArrays should be moved from “most” packages into extensions and loaded only on demand to match users’ static inputs. Package extensions should decrease precompile times a bit, especially since extensions precompile in parallel, so long precompiles get split into parts that run concurrently.
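
For reference, the Project.toml side of the extension mechanism in point 5 looks roughly like this (the Distributions UUID is the real one; the extension module name is illustrative, not necessarily what the linked PR uses):

```toml
[weakdeps]
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"

[extensions]
# This module lives in ext/ and is loaded (and precompiled) only once the
# user also adds Distributions to their own environment.
DiffEqBaseDistributionsExt = "Distributions"
```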

Nobody denies that there are things to do, but we should keep it concrete to make progress.

9 Likes

Is it conceivable for some of this precompilation to occur in the background (maybe with an indicator somewhere) while the REPL stays free to work with?

2 minutes of waiting seems annoying, but nobody does anything actually useful in less than 2 minutes, so probably by the time the first function is written down or the first bit of playing around is finished, it will be over.

That is sort of what happens for me when I start VS Code. While it takes some time for the Julia language server to become fully functional, by the time I actually need it, it is already there.

2 Likes

A first-time user doesn’t need to work with the REPL yet. You distribute a program, download_and_install_julia, that downloads Julia and every package that will be needed for the semester, and triggers precompilation. When the installer is done, you print “All done! Enjoy using Julia!”

Problem solved.
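
A minimal sketch of such an installer, assuming juliaup’s official install script and a hypothetical course package list, shown as a dry run that prints each step instead of executing it:

```shell
#!/bin/sh
# Hypothetical download_and_install_julia script (dry run). The package
# list is an example; an instructor would fill in the real course packages.
PKGS='Pluto Plots DataFrames'

# The Julia command that adds and precompiles every course package at once
# (precompilation runs in parallel across packages).
julia_expr="using Pkg; Pkg.add(String.(split(\"$PKGS\"))); Pkg.precompile()"

echo "Step 1: curl -fsSL https://install.julialang.org | sh   # juliaup"
echo "Step 2: julia -e '$julia_expr'"
echo "All done! Enjoy using Julia!"
```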

19 Likes

I mentioned that package extensions split out parts of a package, so that not all of it precompiles unless the user has also added the weak dependency that triggers the extension. This mechanism can be used to make the full precompilation happen incrementally and on demand.

1 Like

How could I write such a program? In which language? Do you have an example?

3 Likes

The title of this post will catch a first-time user’s or Julia-curious person’s eye (it has the phrase “first-time user”). The first-time user will not be able to understand any of the discussion in the comments, but will take away “don’t use Julia”. So the title ought to be more accurate: “Julia 1.9 beta3 is much slower for first-time users of Pluto”. Or maybe for “some use cases”. Then ask and suggest what changes can be made, including to Pluto itself, in order to reclaim and surpass previous performance.

Chris’ post could be more diplomatic. But it is very direct. I wrote this to separate one of his points from everything else.

I understand the impulse, given that the new release has effectively taken a sledgehammer to the fruit of your long labor. But we have to maintain a broader perspective.

7 Likes

Totally agree. A useful approach would be to define a metric: TFITFPP (time-from-install-to-first-Pluto-plot), and see how it progresses. Usually install times increase with sophistication and optimization. This is a trade-off.

So the title could be:

TFITFPP is 30% slower on Julia 1.9 beta3

(or whatever the slowdown is).

Additionally, the hardware people use matters, and some have less powerful devices, so the benchmarks should always consider those weaker machines as well (like mine).

My beginner class starts in the fall, so no. It would be a nice community contribution, and there are many who could write it. I’ve not played with juliaup yet, but I imagine this is the kind of thing that might be in its bailiwick.

I generally love metrics and measurements, and TFITFPP is brilliant as an acronym (I’ve been practicing how to say it :wink:), but it’s worth asking: how often do you see languages compared in terms of how long their installers take?

Oh, I won’t use anaconda for my course, because most machines already have some version of python already installed on them and the anaconda installer takes like 10 minutes!

is just not something I hear from people teaching Python. So while I’m all in favor of benchmarks, I think this may be one case where they would be overkill.

The point is, the concern of this thread is an installer problem, and people are used to installers taking some time. And other than the installation issue, I think the truth is indeed pretty much the opposite of the thesis of this thread. So I agree this thread leaves people with precisely the wrong impression.

11 Likes

There definitely is a point to be had about making such an installation seamless and pain-free, but indeed — the time it takes is generally not at the top of such a priority list.

So then the problem should perhaps be rephrased into making the installation/first-start process more streamlined.

2 Likes

GitHub - PetrKryslUCSD/VSCode_Julia_portable: Portable Julia running in VSCode?

1 Like

That is a legit concern. But easy to fix: just standardize all package versions at the beginning of the course. That’s totally doable now with a little scripting (copy versions from the “master” Manifest to each individual Manifest), but to make this easier on instructors it would be nice to automate that somehow.
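
The “little scripting” could be as simple as copying the master Manifest into each assignment’s folder (the directory names and the Manifest snippet here are illustrative):

```shell
# Demo setup: a master environment and two assignment folders.
mkdir -p master assignment1 assignment2
printf '[[deps.Example]]\nversion = "0.5.3"\n' > master/Manifest.toml

# Copy the pinned versions into every assignment environment so that each
# student resolves to exactly the same package versions.
for d in assignment1 assignment2; do
  cp master/Manifest.toml "$d/Manifest.toml"
done
```

Students would then run something like julia --project=assignment1 -e 'using Pkg; Pkg.instantiate()' once per assignment to install exactly those versions.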

3 Likes

The (Pluto-specific) problem here, though, is that Pluto’s use of environments makes it likely that a user hits precompilation almost every time they open a new notebook, even if they’re reusing packages they’ve already installed before.

12 Likes

Gotcha (I’ve not used Pluto in a while). But the solution does seem to be in the post above: the instructor can just ship a Manifest with the same package versions used in the previous assignment and/or master Manifest.

4 Likes

Hi, knowing the OP in person, I don’t see any inflammatory purpose here, though I agree that the title could have been chosen so that the Pluto context would be clear. I both admire the great work done for 1.9 and share Fons’ worries, so I appreciate the different ideas for finding a constructive way out of this situation.

I am a heavy user of Pluto notebooks (not only for teaching). They are a great way to have students with minimal installation experience running Julia reliably on different computers across the OS spectrum. They can easily create Pluto notebooks for exam projects etc. which I can immediately run on my computer. So Pluto is indeed already for many the first experience with Julia and can even be a selling point for it.

I think on the Pluto side we could have more tools to handle notebook environments, so that, e.g., all notebooks for a course could share one and the same set of package versions, and there would be one setup notebook that precompiles them all. Maybe this could be done by a download_and_install_julia_with_pluto script…

For my purposes I mostly circumvent the TTFP problem by rolling my own, PlutoVista.jl.

On the Julia side - for the reasons mentioned above I see Pluto on the same priority list as Makie, Plots, and the SciML universe, where great progress has been made.

12 Likes

What are the prospects of precompiling in the cloud and hosting the results (including native code), and then having those files downloaded at package install time? I didn’t follow the details of the design, so I’m not sure whether these native caches are per project (i.e., Manifest.toml) or per package. If precompile files were cached in the cloud, could that solve this problem, at least in the medium term?

Generally, while I think something like a script that handles course setup can mitigate this problem a bit, I don’t think it really addresses the issue that @fonsp outlined at the top, namely that this will leave a weird impression on very, very first users. In my experience, these kinds of workarounds typically lead to even more confusion, because users then start to wonder why such a thing is needed. Why isn’t this just working “the normal way”, without tricks that they don’t understand? Also, if they tinker even slightly and deviate from the environment provided by the instructor, they will still hit the problems @fonsp pointed out.

5 Likes

Invalidations make it so you’d potentially have to consider every combination of other packages in the system. Unless there’s some function-freezing functionality, that’s not possible.

2 Likes