Given that there is already a prompt asking which environment to install it in, would it also make sense to have a prompt asking if we want the latest downloaded or latest remote version?
The example sounds arguably not so common, because (1) most people focus on one or a few projects at a time, (2) some people have large projects with long running times where compilation time doesn't matter, and (3) if you can wait a week before starting the notebook again, then compilation doesn't matter so much.
For power users and package developers, Julia 1.9 makes work a lot faster again.
How does 1.9 make Julia faster for developers?
For me, every restart of Julia is slower now, because I've usually changed some code in a package that triggers recompilation in a whole tree of packages.
I love that other packages load really fast now, but if you are working on 50% of the packages you are using, the overall experience is not faster.
Yes. And I would argue even more so for entry-level users. When people are learning the language for a particular application, they don't add packages all that often. And if students are learning from a set of lecture notes, then those lecture notes have been tested against a particular manifest, so compilation is a one-time thing, and I don't even want them messing with versions.
I think you should consider whether your workflow needs to be modified to prevent doing package operations on every restart. For any project with non-ephemeral code you should always work with a manifest, and if so, I am not sure why package operations would happen often enough to trigger reinstallation.
For tinkering around, the best workflow I am aware of for Julia is creating either a new 'project' to add packages to, or a persistent 'sandbox' project you can mess around with. This is the same with Python… I tell people to always use conda virtual environments and to expect things to be slow when they add packages.
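For concreteness, a minimal sketch of the throwaway-environment variant using plain Pkg calls (the package name is just a placeholder):

```julia
using Pkg

# Throwaway environment in a temp directory: packages added here
# never touch your main environments.
Pkg.activate(temp=true)
Pkg.add("Example")
```

`pkg> activate --temp` at the REPL does the same thing.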
Compiler development resources are the scarcest resource in the community, so time spent making Julia compile faster could instead be spent on things like better support for AD, fixing performance regressions, further lowering TTFX, etc.
Having some sort of compile-in-the-background solution, as at least one person suggested above, would be really helpful here.
Ideally, the TTFX should be at most the time it takes to compile everything needed to do X immediately after Julia startup; and after all background compilation is done, TTFX is just the execution time.
I would say 1.9 is faster for developing top-level packages, because the dependencies stay intact between restarts. It's slower for testing lower-level packages by running higher-level ones that depend on them. Part of the issue is that many packages added comprehensive workloads for precompilation, which run again and again without you, as the developer, profiting. There are some options to disable at least SnoopPrecompile.jl blocks via local preferences, but it's not the most convenient solution, because there might be many such packages and you have to disable them all manually.
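If I remember the SnoopPrecompile docs correctly, the opt-out is a `skip_precompile` preference listing the packages whose workloads to skip (the package names below are placeholders), roughly:

```julia
using SnoopPrecompile, Preferences

# Records the preference in the active project's LocalPreferences.toml;
# the listed packages' precompile workloads are then skipped.
set_preferences!(SnoopPrecompile, "skip_precompile" => ["PackageA", "PackageB"])
```

As noted above, you still have to enumerate every such package by hand.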
@tim.holy is there any native-code caching between different package versions, to eliminate compilation of duplicated code blocks even if there are some changes between versions? I doubt that the low-level stuff changes very often, even if the package version is bumped.
You can just disable pkgimages with a command-line flag (`julia --pkgimages=no`) if that is your workflow, and you get back to 1.8-style precompilation. You can even alias your default Julia to do that, and you haven't lost anything.
Totally, that's probably a good idea. But my comment wasn't complaining about the workflow; I was just pointing out that the idea that 1.9 is faster for development is not broadly true.
@jlperla that's an ideal separation of workflows. But if you maintain tens of the packages in the toolchain you are using for some project, you are likely to be using recent versions, because you bumped them for things you need yourself, and you are also likely to regularly hit bugs that require fixes in these new versions.
You can often solve such problems by fixing the package directly - much faster than doing some workaround in your scripts. Then you get your work done, and PRs pushed as a bonus for everyone else. But 50 packages recompile on restart.
I would recommend using containers. Docker is fine for personal use, but for a class or similar you could try https://apptainer.org/. You can run it within the IDE of your choice (e.g., VS Code).
I'm not gonna argue about how relatively frequent different scenarios are, as we are unlikely to have any real data on it. But IME this is a pretty common scenario in the sciences.
For long-running tasks, all this TTFX discussion and its improvements are irrelevant anyway.
As others already said, that's also far from the case for many usage scenarios. I consider myself a power user, and find it likely that 1.9 will make many of my workflows slower. Luckily, there seems to be a switch that enables the older, faster precompilation.
I believe it's important to focus on true TTFX as well, considering recompilation that is likely to happen regularly in common recommended workflows.
TTFX is exactly what "compile faster" is about. And (arguably?) many more users benefit from TTFX improvements than from better AD.
That's totally true! I seem to notice that some common packages take significantly longer to precompile now than a few months ago, but don't have hard data on that.
To me that is package installation and resolution, not TTFX. Compile everything, install packages, and take whatever time you need to make things as fast as possible afterwards - then amortize that fixed cost by using that snapshot of packages and changing them infrequently. Personally, I would accept even slower compilation times, because I know they come with more caching and with testing for/removing invalidations! Generate as much as possible during installation, on the off chance that I will use it down the road, so that it is super fast then.
So as an outsider (who tried Pluto and got confused for this exact reason), it sounds like Pluto (and related workflows) need to find better ways to amortize that fixed cost. To me, that is what Project/Manifest files already do (not to mention making things more reproducible). I take @Raf's point that with people maintaining 50 repos maybe there isn't a better option, though, so it is nice to be able to turn off the caching for them. But that is a small segment of the community.
Consider a different language for a second. Let's say a workflow required me to install PyTorch (which has a bunch of binaries and takes forever) and supporting libraries for my Python notebooks every time I opened a notebook. And if I had a dozen notebooks, it potentially did that every time, or whenever any of the dependencies had a minor update. I wouldn't blame conda or pip for that; I would rethink how I am using them. Even containerized setups like BinderHub build an image from the pip requirements or Project.toml file, taking as much time as they need, and make things blindingly fast afterwards.
Or, you know, switch to the older, faster precompilation mode (: Maybe all those fashionable new long precompile blocks in packages can also easily be disabled by some flag?
To make multiple analyses/scripts/notebooks/… both reproducible and independent with respect to future code modifications, they need to live in separate environments. These environments differ in some dependency versions even if you use very similar package sets everywhere, just because packages get updated and the newest versions get installed. So, many packages require recompilation when starting work in a new env like this, when adding some other package, or just when the compile cache gets cleaned up.
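For concreteness, the standard per-analysis pattern looks something like this (the path is made up):

```julia
using Pkg

# Each analysis directory carries its own Project.toml + Manifest.toml.
Pkg.activate("analyses/2023-04-sensitivity")
Pkg.instantiate()  # install the exact recorded versions, precompiling
                   # whatever isn't cached yet for this combination
```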
Why consider a different language with its own tradeoffs?
These workflows I described above are pretty common in Julia, and they will suddenly become significantly slower in 1.9. It is possible for them to be faster, as evidenced by earlier Julia versions.
Sure, and some capable tooling is one way to alleviate the very long precompilation time in Julia. For example, make it convenient to share dependency trees between envs: when I create an env and add some package, it's likely that an already downloaded and precompiled version would be fine for me, instead of the absolute latest one that isn't used in any other env.
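Part of this exists already, if I understand `Pkg.offline` correctly; a sketch:

```julia
using Pkg

Pkg.offline(true)      # resolver restricts itself to versions already on disk
Pkg.add("DataFrames")  # so an installed (and precompiled) version is reused
Pkg.offline(false)     # back to normal resolution
```

That still has to be opted into per session, rather than being the default tiebreak suggested here.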
The best from a user PoV would be something like:
It's actually worse than just people who maintain 50 repos. If you maintain just a few very low-level repos, you will get the same effect. For me, just working on GeoInterface.jl or ConstructionBase.jl will cause huge cascades of recompilation, taking quite a few minutes.
But yes, the more packages you maintain, at all levels, the more regularly this will happen.
But there does seem to be a fundamental clash between having clean separate environments and having fast compilation.
This is why I suggested some Pkg.jl tools to merge/standardise manifests so multiple projects can use as many of the same dependencies as possible. We could pass some flag to either optimise for newest packages, or fastest compile times.
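As a rough illustration of what such a Pkg.jl tool could start from (entirely hypothetical, nothing like this exists today), it could scan manifests for packages resolved to different versions across projects:

```julia
# Hypothetical sketch: report packages pinned to different versions across
# several projects' Manifest.toml files, i.e. candidates for standardising.
using TOML

function version_spread(manifest_paths)
    versions = Dict{String,Set{String}}()
    for path in manifest_paths
        manifest = TOML.parsefile(path)
        # Format 2.0 manifests (Julia 1.7+) nest packages under "deps";
        # older manifests keep them at the top level.
        deps = get(manifest, "deps", manifest)
        for (name, entries) in deps
            entries isa Vector || continue  # skip scalar keys like "julia_version"
            for entry in entries
                v = get(entry, "version", nothing)  # stdlibs record no version
                v === nothing || push!(get!(Set{String}, versions, name), v)
            end
        end
    end
    # Keep only packages that differ somewhere across the given projects.
    return Dict(name => vs for (name, vs) in versions if length(vs) > 1)
end

version_spread(["analysisA/Manifest.toml", "analysisB/Manifest.toml"])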
I think I may just be dumb, but wouldn't it be much improved if environment dependencies were treated more the way we treat package dependencies: by installing the latest (offline or online, that's another story) version of the package compatible with the project? We don't depend on Manifest files for package compatibility; why would we want that, in most cases, for environments?
(And nothing against having the great feature of being able to have exactly reproducible environments; I just don't think that is what most users generally want.)
That could be solved by just creating a new command that, instead of being called `instantiate`, could be called `resolve_deps`, or whatever, to install the latest available non-breaking version of every package, according to what is present in the Manifest.
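A minimal sketch of the proposal (`resolve_deps` is the hypothetical name from above, not a real Pkg command; `Pkg.update` is probably the closest existing building block):

```julia
using Pkg

# Hypothetical `resolve_deps`: instead of reproducing the Manifest verbatim
# like `instantiate`, bump every dependency to the newest version allowed by
# the project's [compat] bounds (non-breaking if compat is semver-style).
function resolve_deps(project_path)
    Pkg.activate(project_path)
    Pkg.update()
end
```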
Just curious: for people who are annoyed by precompilation time, how many are running with

```julia
# Don't update the registry without explicit instructions to do so.
# This reduces the amount of precompilation needed.
using Pkg
Pkg.UPDATED_REGISTRY_THIS_SESSION[] = true
```

in your startup.jl? I've only started this recently so I don't have a lot of data, but it seems like it should help quite a bit, and anecdotally it seems like it might already be noticeable. Is anyone using that and still disliking the 1.9 tradeoff?
I started using `Pkg.UPDATED_REGISTRY_THIS_SESSION[] = true` after reading about it in this thread, and it does help a bit. At first I assumed from the name that it would only help within a session, but now I understand that it will always stop `Pkg.add` from updating the registry. I guess the main reason I still see more precompilation than I want is that I run `Pkg.up` regularly when working on small projects. I do avoid running `Pkg.up` on a large project unless I can wait, but the other `Pkg.up` call will update the registry. That means that if I add a package to a large project, it will often install new low-level dependencies and trigger a lot of precompilation, despite `UPDATED_REGISTRY_THIS_SESSION`.
For me this discussion was never about disliking the 1.9 tradeoff; I love pkgimages, thank you for working on it. My main point is that the longer precompilation makes `Pkg.add` behavior worth reconsidering, to avoid waiting for precompilation in situations where I think many people don't want to; hence I argued above for making it add already-installed versions when possible.
> That means that if I add a package to a large project, it will often install new low-level dependencies and trigger a lot of precompilation, despite `UPDATED_REGISTRY_THIS_SESSION`.
I think part of the problem here is this: `pkg> add` defaults to `preserve=PRESERVE_TIERED`, which is a sensible strategy IMO that tries all of the following strategies from the top down until the package can be added.
| Value | Description |
|:------------------|:------------------------------------------------------------------------------------|
| `PRESERVE_ALL` | Preserve the state of all existing dependencies (including recursive dependencies) |
| `PRESERVE_DIRECT` | Preserve the state of all existing direct dependencies |
| `PRESERVE_SEMVER` | Preserve semver-compatible versions of direct dependencies |
| `PRESERVE_NONE` | Do not attempt to preserve any version information |
| `PRESERVE_TIERED` | Use the tier which will preserve the most version information (this is the default) |
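For reference, you can also request a specific tier yourself, either as `pkg> add --preserve=all Example` or via the API; a small sketch:

```julia
using Pkg

# Strictest tier: error out rather than change any version
# already recorded in the Manifest.
Pkg.add("Example"; preserve=Pkg.PRESERVE_ALL)
```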
That should usually work well, but one exception is that Pkg currently does not know how to handle build numbers on packages, and will always update if a new build is available.
JLLs use build numbers, and so can punch through this preserve behavior, causing updates that are usually quite low in the dep tree.
Take this example, where the registry is checked out before and after a new JLL build is available.
`ImageMagick_jll v6.9.11+3 ⇒ v6.9.11+4` happens even though it's completely unrelated to `Example`.
```
shell> cd ~/.julia/registries/General
/home/ian/.julia/registries/General
shell> git checkout ae8ec3b695efb04ddec4371b97477779b6c62549
HEAD is now at ae8ec3b695 New version: OpenAPI v0.1.7 (#77093)
(@v1.10) pkg> activate --temp
Activating new project at `/tmp/jl_bgg4u9`
(jl_bgg4u9) pkg> add ImageMagick_jll@6.9.11
Updating registry at `~/.julia/registries/General`
┌ Error: Some registries failed to update:
│     - `~/.julia/registries/General` - registry detached
└ @ Pkg.Registry ~/Documents/GitHub/julia/usr/share/julia/stdlib/v1.10/Pkg/src/Registry/Registry.jl:476
Resolving package versions...
Installed ImageMagick_jll ─ v6.9.11+3
Downloaded artifact: ImageMagick
Updating `/tmp/jl_bgg4u9/Project.toml`
⌅ [c73af94c] + ImageMagick_jll v6.9.11+3
Updating `/tmp/jl_bgg4u9/Manifest.toml`
[692b3bcd] + JLLWrappers v1.4.1
[21216c6a] + Preferences v1.3.0
[61579ee1] + Ghostscript_jll v9.55.0+3
⌅ [c73af94c] + ImageMagick_jll v6.9.11+3
[aacddb02] + JpegTurbo_jll v2.1.2+0
[88015f11] + LERC_jll v3.0.0+1
[89763e89] + Libtiff_jll v4.4.0+0
[d3a379c0] + LittleCMS_jll v2.12.0+0
[643b3616] + OpenJpeg_jll v2.4.0+0
[3161d3a3] + Zstd_jll v1.5.2+0
[b53b4c65] + libpng_jll v1.6.38+0
[0dad84c5] + ArgTools v1.1.1
[56f22d72] + Artifacts
[2a0f44e3] + Base64
[ade2ca70] + Dates
[f43a241f] + Downloads v1.6.0
[7b1f6079] + FileWatching
[b77e0a4c] + InteractiveUtils
[8f399da3] + Libdl
[56ddb016] + Logging
[d6f4376e] + Markdown
[ca575930] + NetworkOptions v1.2.0
[44cfe95a] + Pkg v1.10.0
[de0858da] + Printf
[3fa0cd96] + REPL
[9a3f8284] + Random
[ea8e919c] + SHA v0.7.0
[9e88b42a] + Serialization
[6462fe0b] + Sockets
[fa267f1f] + TOML v1.0.3
[a4e569a6] + Tar v1.10.0
[cf7118a7] + UUIDs
[4ec0a83e] + Unicode
[deac9b47] + LibCURL_jll v7.84.0+0
[29816b5a] + LibSSH2_jll v1.10.2+0
[c8ffd9c3] + MbedTLS_jll v2.28.0+0
[14a3606d] + MozillaCACerts_jll v2022.10.11
[83775a58] + Zlib_jll v1.2.13+0
[8e850ede] + nghttp2_jll v1.48.0+0
[3f19e933] + p7zip_jll v17.4.0+0
Info Packages marked with ⌅ have new versions available and may be upgradable.
Precompiling environment...
1 dependency successfully precompiled in 1 seconds. 13 already precompiled.
(jl_bgg4u9) pkg> add Example
Resolving package versions...
Updating `/tmp/jl_bgg4u9/Project.toml`
[7876af07] + Example v0.5.3
Updating `/tmp/jl_bgg4u9/Manifest.toml`
[7876af07] + Example v0.5.3
(jl_bgg4u9) pkg> rm Example
Updating `/tmp/jl_bgg4u9/Project.toml`
[7876af07] - Example v0.5.3
Updating `/tmp/jl_bgg4u9/Manifest.toml`
[7876af07] - Example v0.5.3
shell> git checkout 3dd229c080a140cf67146ceb3ed6dba8add22ddf
Previous HEAD position was ae8ec3b695 New version: OpenAPI v0.1.7 (#77093)
HEAD is now at 3dd229c080 New version: ImageMagick_jll v6.9.11+4 (#77094)
(jl_bgg4u9) pkg> add Example
Resolving package versions...
Installed ImageMagick_jll ─ v6.9.11+4
Downloaded artifact: ImageMagick
Updating `/tmp/jl_bgg4u9/Project.toml`
[7876af07] + Example v0.5.3
⌅ [c73af94c] ↑ ImageMagick_jll v6.9.11+3 ⇒ v6.9.11+4
Updating `/tmp/jl_bgg4u9/Manifest.toml`
[7876af07] + Example v0.5.3
⌅ [c73af94c] ↑ ImageMagick_jll v6.9.11+3 ⇒ v6.9.11+4
Info Packages marked with ⌅ have new versions available and may be upgradable.
Precompiling environment...
1 dependency successfully precompiled in 1 seconds. 14 already precompiled.
```
Issue tracked at Feature request: support version numbers with build metadata · Issue #1568 · JuliaLang/Pkg.jl · GitHub
So I think a reasonable strategy here is:
- Make `pkg> add` no longer update the registry automatically
- Make Pkg aware of build numbers, so it knows when not to update them
- Add a `pkg> update --noreg` mode to do an env update without updating the registry