How to help reduce package load latency?

Hi, I want to see if there’s any way to… proceed in a useful fashion.

It seems like the “normal workflow” for improving package latency fails in this case. I watched a video and tried some tools that have clearly been worked on very hard, and they do not seem to work in this case. Are the tools broken? Should I follow up with someone about that?

If I am honest, I think the compilation is surprisingly slow as well. On my computer, this single package takes 4-8 minutes to load. I am able to compile very large C++ projects in that amount of time. I would be happy to see what else is happening, if I knew how.

Since I am developing this package, something akin to incremental compilation would help a lot. Right now, every time I add a single-line debug statement, the whole thing precompiles again and I lose about 10 minutes.

That is a term in Julia and it refers to package-wise precompilation.

Did you try Julia 1.10-rc1? In my experience, precompilation is twice as fast.

I also decided to get a new laptop with a Ryzen 7 7840u CPU with 32GB RAM, which reduced my compile times by a factor of 2.5 compared to my old laptop.

Finally, on Linux you can also achieve better compilation speeds than on Windows (well, perhaps 20% better, it depends).


I’m confused about why you would wait for precompilation over a debug statement. Are you using Revise.jl, as in the recommended Revise-based workflow?

https://docs.julialang.org/en/v1/manual/workflow-tips/#Revise-based-workflows

This performs in-memory incremental compilation and is useful for debugging.

There is also the VSCode integrated debugger:
https://www.julia-vscode.org/docs/stable/userguide/debugging/

As for diagnosing compilation, the --trace-compile=stderr switch will provide some basic information about what is being compiled.

$ julia --trace-compile=stderr -e "using Pkg; Pkg.activate()"
precompile(Tuple{typeof(Pkg.API.activate)})
precompile(Tuple{Pkg.API.var"##activate#295", Bool, Bool, Bool, Base.TTY, typeof(Pkg.API.activate)})
  Activating project at `~/.julia/environments/v1.9`
precompile(Tuple{Type{Base.Generator{I, F} where F where I}, Pkg.Types.var"#52#55"{String, String}, Array{Any, 1}})
precompile(Tuple{typeof(Base.collect_similar), Array{Any, 1}, Base.Generator{Array{Any, 1}, Pkg.Types.var"#52#55"{String, String}}})
precompile(Tuple{Pkg.Types.var"#52#55"{String, String}, Base.Dict{String, Any}})
precompile(Tuple{Type{Array{Dates.DateTime, 1}}, UndefInitializer, Tuple{Int64}})
precompile(Tuple{typeof(Base.collect_to_with_first!), Array{Dates.DateTime, 1}, Dates.DateTime, Base.Generator{Array{Any, 1}, Pkg.Types.var"#52#55"{String, String}}, Int64})
precompile(Tuple{typeof(Base.convert), Type{Base.Dict{String, Union{Array{String, 1}, String}}}, Base.Dict{String, Any}})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{String, Union{Array{String, 1}, String}}, Array{String, 1}, String})
precompile(Tuple{typeof(Core.Compiler.eltype), Type{Array{UInt64, 1}}})
precompile(Tuple{typeof(Base.deepcopy_internal), Tuple{UInt64}, Base.IdDict{Any, Any}})
precompile(Tuple{typeof(Base.deepcopy_internal), Tuple{String}, Base.IdDict{Any, Any}})
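
Each precompile(...) line in that trace has the same shape as a call to Base's precompile function, so lines like these can in principle be collected and replayed as precompile directives. Here is a minimal sketch of that shape using only Base. The function double is hypothetical and merely stands in for package code.

```julia
# Hypothetical function standing in for code from a package under development.
double(x) = 2x

# Same shape as the --trace-compile output: one Tuple type holding the
# function's type followed by the argument types.
ok_tuple = precompile(Tuple{typeof(double), Int})

# Equivalent form: the function plus a tuple of argument types.
ok_args = precompile(double, (Float64,))

# precompile returns a Bool reporting whether the request succeeded.
println("Tuple form: ", ok_tuple, ", argument form: ", ok_args)
```

Directives like these only pay off when they run while the package itself is being precompiled, so that the result ends up in the package's cache file.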

Additional insight is provided by SnoopCompile.jl. This can help answer the question of why something is being compiled or recompiled.
https://timholy.github.io/SnoopCompile.jl/dev/

Because Julia is a dynamic language with multiple dispatch, compilation, especially ahead-of-time compilation, can get quite complicated. In Julia, type inference is needed to determine exactly what to compile. Additionally, what was previously compiled can be invalidated by the addition of new methods. SnoopCompile and friends can help identify inference and invalidation issues.
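
To make the invalidation point concrete, here is a small sketch that needs only Base. The functions describe and report are hypothetical names chosen for illustration.

```julia
# Hypothetical functions used only for illustration.
describe(x) = "something"        # generic fallback method
report(x) = describe(x)          # caller that depends on describe

# The first call compiles report(::Int) against the only method that exists so far.
before = report(1)

# Adding a more specific method changes which method describe(1) dispatches to,
# so the code already compiled for report(::Int) can no longer be trusted.
describe(x::Int) = "an integer"

# The next call has to compile report(::Int) again before it can run.
after = report(1)

println(before, " -> ", after)
```

In a large dependency tree the same thing can happen when one package adds methods to functions that other packages have already compiled against, which is the kind of event SnoopCompile is meant to help locate.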

With precompiled pkgimages there is also some additional latency from validating the cached native code and loading it into the existing in-memory compilation. Creating a system image via PackageCompiler.jl can help avoid that latency, but that process can be lengthy. Recent versions of Julia can reuse pkgimages to help create the system image.


re: Revise. My current workflow is to have a Julia REPL with Revise loaded at startup. I load my test file with includet("test.jl"). Then I make a change to a file in my package and save it. Then I reload with include("test.jl"). This seems to re-precompile my entire package, which takes 4-10 minutes. Is this not the expected behavior from Revise?

I never had this problem, but I am also not following the official path. I clone the repo of my package and start julia with julia --project in the folder of the package. I never dev a package.

In addition I always create a sysimage that contains all packages but the package I am working on.

Advantage: Fast edit-compile-run cycle. I try to make sure that even restarting Julia does not take more than 5s.

Disadvantage: At one point in time I can only work on one package and not edit sub-packages.

To correct myself, using KiteModels after a code change triggers a recompilation that takes 13s in my case even if Revise is loaded. Perhaps somebody else can explain how to avoid that?

Don’t do that. Use the debugger in VSC, or, even if you add printf messages, just run the test from the REPL. The problem with the VSC debugger is that it annoyingly does not adapt when lines are added or removed and starts stopping at the wrong lines. Then there is no other option but to kill the REPL and restart … which then precompiles again and waiiiiit

If you are developing a package, you should not be using includet.

Say I have a package in development at /home/mkitti/.julia/dev/MyPackage. That folder contains my Project.toml and Manifest.toml files, and a src directory where my code is located.

julia> using Pkg

julia> Pkg.activate(); Pkg.add("Revise") # add Revise to my shared @1.9 environment

julia> Pkg.generate("/home/mkitti/.julia/dev/MyPackage") # do this once to create the package

julia> cd("/home/mkitti/.julia/dev/MyPackage")

julia> using Pkg; Pkg.activate(".")

julia> using Revise, MyPackage # Revise should come first
[ Info: Precompiling MyPackage [49e29ff8-36dd-4dd5-81b7-937a02e2e49b]

julia> MyPackage.greet()
Hello world!

julia> edit("src/MyPackage.jl")

julia> MyPackage.greet() # no extra precompilation needed 
¡Hola Mundo!

Is there a way to transition this part of the discussion to a new thread? This sounds like very useful information, but it is not related to the thread title.

Does the difference between what you suggest and what I’m currently doing come from activating the package first? Or does it come from the fact that greet is a function inside the package? (I will point out that the website you linked specifically recommends “Put your test code in another file” as one of the bullet points.)

I’m confused as to what’s wrong with my workflow. Doesn’t includet tell Revise to track changes in the file included and all packages it depends on? Does it do so, but only in a suboptimal way that triggers re-precompilation?

Only the file. Revise tracks changes to imported devved packages separately.

The include("test.jl") is not part of Revise’s usage. Revise’s tracking should apply any changes when the file is saved. If any methods were invalidated by the changes, or if there are new methods, they are compiled on the next runtime calls.

Just to clarify, are you importing your package in test.jl (what does its import statement look like), and what else are you doing in it? That could help explain why the errant include step is doing the 4-10 minutes of recompilation. What are the REPL printouts while this happens, if any?

You should not be using include to reload the code. Revise will track changes to files on disk. It should not be recompiling the entire package. If you are using include as you say, you have effectively created a new module independent of the prior one.
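
As a rough illustration of that last point, here is a sketch that evaluates the same module definition twice, the way a repeated include of a file that defines a module would. It assumes the included file contains a module definition. The module name Demo and its source are hypothetical, and include_string is used only to keep the example in one file.

```julia
# Hypothetical module source, standing in for a file that defines a module.
src = """
module Demo
greet() = "Hello world!"
end
"""

first_load = include_string(Main, src)    # defines Main.Demo
old_greet = first_load.greet              # keep a handle on the original function

second_load = include_string(Main, src)   # Julia warns that module Demo is being replaced

# The second evaluation builds a fresh module rather than updating the first,
# so neither the modules nor their functions are the same objects.
println("same module?   ", first_load === second_load)
println("same function? ", old_greet === second_load.greet)
```

Anything compiled for the first copy belongs to that copy, so the replacement starts over with its own compilation work.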

In particular, Revise tracks the following.

  1. Files included via includet
  2. If you have activated a package environment, it will track changes to that package.
  3. It will track changes to packages you have included in your environment via Pkg.develop or ] dev.
  4. It can also track changes to Base

See details here:

Yes, in part. I am activating a package environment and using a module of the same name. Revise.jl will then track changes to that package. includet can be useful for loading additional code, but Revise.jl will not transitively track code that you have included via include.

Part of your question appears to be about reducing package loading latency during development, along with a request for incremental compilation. Revise.jl addresses this case directly. Your description of your experience does not seem to match correct usage of Revise.jl.

I will also note that Revise.jl and SnoopCompile.jl share the same author, @tim.holy, and these packages are meant to be used together.

In this thread you have not shared much code, which limits the degree of concrete advice we can provide.

If you would like, take the MyPackage example and expand it to mimic your current workflow. If you then share the corresponding code with us, such as what you are using includet with, we can help debug your workflow.


Very helpful advice from @mkitti (thanks), but let me correct one thing:

I don’t really think they are meant to be used together; SnoopCompile is about analyzing sources of latency, Revise is about workflows that avoid latency by keeping your session running. Certainly, though, they are both useful for people who want to reduce latency.


Well, the documentation does say

Finally, another alternative for reducing latency without any modifications to package files is Revise. It can be used in conjunction with SnoopCompile.

Maybe I should eliminate that. I was just listing various tools aimed at reducing latency, but otherwise there’s almost no overlap between the two, and I imagine (I can’t honestly remember) that I was just letting people know that you don’t have to choose between tools.

An update: After adding a long precompile_workload and upgrading to Julia version 1.10, the package load time is ~39 seconds on my reasonably fast desktop. This is down from about 140 seconds in Julia 1.9 (no change to the code), and down from about 9 minutes without the precompile_workload in Julia 1.0.
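
The workload itself is not shown in this thread. One common way to write one is with PrecompileTools.jl, and whether that is what was used here is an assumption. The sketch below requires PrecompileTools to be installed. The module name MyPackage is borrowed from the earlier example, and the function estimate and its data are hypothetical.

```julia
module MyPackage  # hypothetical package module

using PrecompileTools: @setup_workload, @compile_workload

# Hypothetical function standing in for the package's real entry point.
estimate(xs) = sum(xs) / length(xs)

@setup_workload begin
    # Setup code runs during precompilation, but compilation triggered here
    # is not specifically targeted for caching.
    data = [1.0, 2.0, 3.0]

    @compile_workload begin
        # Calls made in this block are compiled while the package precompiles,
        # and the results are stored in the package's cache.
        estimate(data)
    end
end

end # module

println(MyPackage.estimate([1.0, 2.0, 3.0]))
```

The workload only helps when it lives inside a real package that gets precompiled, and it should exercise the same call paths with the same argument types that users will actually hit.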


Is there a reason you don’t put half of those dependencies in extensions? Does every ParameterEstimation.jl workflow need all of those packages? If not, it’s a pretty easy win to split them up and have users explicitly import the packages needed for different functionality.


The honest answer is that I don’t know how and it is not my first priority.

However, in this particular case, we are loading very large packages for only a few of their functions, but we really do need those. ParameterEstimation.jl actually exports only a few symbols and represents just a few algorithmic ideas, but they really do need linear algebra, differential equations, and Groebner bases.


Splitting into independent packages or extensions is only feasible if the code has many independent parts. That doesn’t apply to a concise API with an inherently large codebase, and more code naturally takes longer to compile and load. Caching the compiled code is the best you can do, and going from 9 minutes to 39 seconds is a great job.