Julia 1.9.0-beta2 precompilation taking almost twice as long

To me, it feels worth optimising for latency after precompilation because the number of notebooks started for a given Pluto version is presumably a lot larger than the number of times Pluto is updated.

Using some of the numbers above, you only need to start 4 notebooks to have gained time from the new system.

2 Likes

Maybe he meant deleting the .julia/compiled/v1.9 folder, which is an operation I do in order to clean the compilation cache. I agree that deleting something in .julia could be dangerous. Is there, then, a Julia-approved way to clean the compile cache, e.g., by calling some function inside the REPL?

Precompilation of Pluto itself is only part of the equation. The more notebooks one has, the more precompilation time one pays.

The environment of each notebook needs to be precompiled after:

- updating julia
- updating any package in that env
- adding/removing deps in those envs
- cache cleanup
- creating/loading a new notebook
- maybe something else

So, if precompilation of many packages actually got much slower, as this thread suggests, then the TTFX seen by the majority of Pluto users would suffer significantly. Yes, running the same notebook again without changing anything would be faster, but both scenarios are very common.

1 Like

@ufetcner7, no, I did not delete the .julia folder. I am a user, and I am always afraid of taking sophisticated steps into the unknown. I put myself in the shoes of my students: I need to start a new Julia session, import and run Pluto, and open a notebook. In my trials, I did precisely this, several times.

Apparently, the latest version of Pluto (v0.19.20) does the job significantly faster in Julia 1.9.0-beta3 than in v1.8.5 (I tested with different notebooks and different packages). In 1.9.0-beta3, importing and running Pluto is around three times faster (2x faster on another computer) than in v1.8.5. The gains are also visible when opening a notebook (23% and 15% on two different computers). I think the time it takes to open a notebook is the most crucial factor for users like myself or my students.

Nevertheless, I kept in my notes that the first time I used Pluto (v0.19.20) with Julia 1.9.0-beta3, it took 110s to precompile Pluto and 4’45’’ to open the main notebook I used in these trials. I understand that such a long time may be detrimental for some users, but for those who use Pluto notebooks a lot, the crucial point is how long it takes to have an average-sized notebook running on one’s computer.

If your package is loaded, you can use `Base.compilecache(Base.module_keys[the_module])` to execute and cache precompile statements again.
I also use `rm.(Base.find_all_in_cache_path(Base.module_keys[the_module]))` to remove the .ji files from within the REPL.
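A minimal sketch of how those two internals fit together; `Example` stands in for whatever package is already loaded, and note that `Base.module_keys`, `Base.find_all_in_cache_path`, and `Base.compilecache` are undocumented internals that may change between Julia versions:

```julia
# Sketch: wipe and regenerate the precompile cache of a loaded package.
# "Example" is a placeholder for the already-loaded module of interest.
using Example

pkgid = Base.module_keys[Example]         # PkgId of the loaded module
rm.(Base.find_all_in_cache_path(pkgid))   # delete its cached .ji files
Base.compilecache(pkgid)                  # precompile and cache it again
```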

1 Like

Thanks!

Not really; the more unique packages one has, the more one has to precompile.

updating julia

Yes

updating any package in that env

Only those packages that update (and their reverse dependencies)

adding/removing deps in those envs,

Removing deps does not require precompilation; adding one is similar to updating.

cache cleanup

Not sure what this is.

new notebook

Only if it uses packages that the other notebooks don’t.

Unique package-version combinations, including transitive dependencies. It’s very common to create two notebooks within a few days and end up with at least one different package version in the manifest.

Same.

Julia doesn’t keep precompiled code for all package versions forever, right?

1 Like

Well, if you want to measure the precompilation time of a package, you must start with an empty .julia folder. Of course, you can also rename it before doing the measurement, which is safer.

I often do my measurements in a virtual machine that I do not use for other tasks, which is also safe. But deleting the .julia folder should always be safe if you keep your own code in a git repository on a server.

That’s inaccurate. You only need to delete the compiled/vX.Y subdirectory of the depot, and that is safe to do because it only contains autogenerated files.
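For example, a minimal sketch of that targeted cleanup from the REPL, assuming the default user depot is first in `DEPOT_PATH`:

```julia
# Remove only the autogenerated precompile cache for this Julia minor version;
# packages, registries, and any dev code in the depot are left untouched.
compiled = joinpath(DEPOT_PATH[1], "compiled", "v$(VERSION.major).$(VERSION.minor)")
isdir(compiled) && rm(compiled; recursive=true)
```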

Which you didn’t suggest. People who believed your suggestion was safe might lose their development code.

3 Likes

If they don’t have it in git, which is stupid in the first place. But you are right, scientists are often not programmers; sometimes I forget that, sorry…

People may not have had the time to push their latest local work.

Thanks a lot, I did not know that…

Idea for @fonsp (related to Allow to provide environment at notebook start · Issue #1788 · fonsp/Pluto.jl · GitHub): it would be nice to have a tool that sets the environment for a notebook, or any other way to make a bunch of Pluto notebooks use the same package versions.
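One workaround that already works, as far as I know: a `Pkg.activate` cell disables Pluto’s built-in package manager, so several notebooks can point at one shared, pre-resolved environment (the path below is hypothetical):

```julia
# First cell of each notebook. Activating an explicit environment turns off
# Pluto's built-in package management, so every notebook that activates the
# same directory shares one Project.toml/Manifest.toml.
begin
    import Pkg
    Pkg.activate("/path/to/shared_env")
    using Plots
end
```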

In my experience, notebooks tend to end up with different package versions in their manifests, simply because you made them on different days. So using the same packages doesn’t really help.

1 Like

Hey @VivMendes! It looks like these benchmarks were done with a hot cache. Depending on your situation, this could actually be realistic (students will often re-run the same notebook), and it demonstrates the advantage of Julia 1.9.

We have been running benchmarks where we also clear the cache to simulate the experience of first-time Julia users. For a class like https://computationalthinking.mit.edu/ , which has lots of notebooks with different packages, and (hopefully) lots of people trying to run various lectures and homeworks on their own computer or binder, the increased precompilation times are more noticeable.

I wrote more about my benchmarks and experience in this post :point_right: Julia 1.9 beta3 is much slower for first-time users

I wonder if the solution to slow precompilation in Julia will be similar to what the Python community did with conda, i.e., a set of environments with common popular packages, precompiled and stored in the cloud.

If packages such as Plots.jl, Makie.jl, and other widely used heavy packages lived in a central hub with precompiled sysimages, could end-users ask the package manager to download these generic binaries from the central hub instead of compiling locally? This would improve the experience for most users, who do not want to modify packages but just use stable releases.

If the no-free-lunch theorem applies here, TTFX simply moves to precompilation, and we will never be satisfied with the total time. We could exploit a good internet connection to alleviate the problem.

5 Likes

Would it be possible to store precompiled package images in the cloud? That could be more fine-grained than system images…

I mean, after Make precompile files relocatable/servable · Issue #47943 · JuliaLang/julia · GitHub has been implemented…

2 Likes

I’ve been considering this. You would precompute some “standard slices” of the current set of registered packages that cover the entire registry when you union them together, but which have some incompatibility between them. You can publish these along with the latest registry. Instead of resolving on the client side, you just find a slice that has all the packages you want; if your needs can’t be satisfied by any precomputed slice, then there’s some inherent incompatibility in what you’re trying to resolve. Benefits: fast, simple, very predictable resolution of requirements and you can precompile entire slices and serve the binaries. Drawbacks: I’m not sure about the feasibility of computing these, but I have some ideas.
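A toy sketch of the client-side lookup this would enable; all slice data and names here are hypothetical:

```julia
# Each "slice" is a pre-resolved, internally compatible set of package versions
# that could be precompiled once and served as binaries.
slices = [
    Dict("Plots" => v"1.38.0", "DataFrames" => v"1.4.4"),
    Dict("Makie" => v"0.19.1", "DataFrames" => v"1.4.4"),
]

# Client-side "resolution" reduces to finding a published slice that contains
# every requested package; no match means the request can't be satisfied by
# any precomputed slice.
function find_slice(slices, wanted)
    i = findfirst(s -> all(p -> haskey(s, p), wanted), slices)
    i === nothing ? nothing : slices[i]
end

find_slice(slices, ["Plots", "DataFrames"])  # → the first slice
```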

17 Likes