Introduction
While using the Julia nightly build, which will become Julia 1.11, I noticed that loading Pkg.jl causes method invalidations. As of Julia 1.11, Pkg.jl is no longer part of the system image and is thus loaded like a normal package. As a result, Pkg.jl is now capable of invalidating methods compiled into the system image.
Method invalidations occur when newly introduced methods force the recompilation of previously compiled code. Typically this involves code that was compiled for abstract rather than concrete argument types, which results in type-unstable code. Recompiling that code adds to compilation latency.
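As a minimal sketch of the mechanism (the functions `f` and `g` are hypothetical and have nothing to do with Pkg.jl), consider a caller whose argument type is not known to inference. Compiled code for `g` may assume that `f` has only one method, and defining a more specific method afterwards makes that assumption stale:

```julia
# Hypothetical example functions, not taken from Pkg.jl.
f(x) = 1

# The element type is Any, so inference cannot know which concrete type
# reaches `f`. With a single method of `f` defined, the compiler may still
# resolve the call ahead of time.
g(v::Vector{Any}) = f(v[1])

before = g(Any[1])   # compiles `g` against the only existing method of `f`

# A new, more specific method. Previously compiled code for `g` that relied
# on the old method table has to be discarded and recompiled.
f(x::Int) = 2

after = g(Any[1])    # `g` now dispatches to the new method
```

Here `before` is 1 and `after` is 2. The recompilation of `g` between the two calls is the kind of work that `@snoopr` reports.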
Pkg.jl is a package that we all use. This creates a unique opportunity for a case study for how invalidations occur and how they may be resolved. To observe the invalidations yourself, download a nightly build and follow the steps below.
Eliminating the invalidations will reduce compilation latency not only for Pkg.jl but also for other packages that depend on the invalidated methods, since those methods would otherwise need to be recompiled.
Finding the Invalidations
You can find invalidations using SnoopCompileCore.jl and display them using SnoopCompile.jl. Specifically, the @snoopr macro returns the invalidations. See the SnoopCompile documentation for a detailed guide on how to do this. Here's how I applied SnoopCompile[Core] to Pkg.jl:
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.0-DEV.1340 (2024-01-19)
 _/ |\__'_|_|_|\__'_|  |  Commit fb2d946b885 (0 days old master)
|__/                   |
julia> using SnoopCompileCore
invalidations = @snoopr using Pkg
959-element Vector{Any}:
julia> using SnoopCompile
trees = invalidation_trees(invalidations)
3-element Vector{SnoopCompile.MethodInvalidations}:
inserting ==(a::Union{Pkg.BinaryPlatforms.FreeBSD, Pkg.BinaryPlatforms.Linux, Pkg.BinaryPlatforms.MacOS, Pkg.BinaryPlatforms.Windows}, b::Base.BinaryPlatforms.AbstractPlatform) @ Pkg.BinaryPlatforms ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/BinaryPlatforms_compat.jl:89 invalidated:
backedges: 1: superseding ==(x, y) @ Base Base.jl:177 with MethodInstance for ==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform) (4 children)
inserting tags(::Pkg.BinaryPlatforms.UnknownPlatform) @ Pkg.BinaryPlatforms ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/BinaryPlatforms_compat.jl:17 invalidated:
mt_backedges: 1: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.triplet(::Base.BinaryPlatforms.AbstractPlatform) (0 children)
2: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.os_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
3: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libgfortran_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
4: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.cxxstring_abi(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
5: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libstdcxx_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
6: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.arch(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
7: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.os(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
8: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libc(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
9: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.call_abi(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
10: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.platforms_match(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.Platform) (17 children)
11: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for (::Base.BinaryPlatforms.var"#match_loss#50")(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.Platform) (41 children)
inserting print(io::Pkg.UnstableIO, arg::Union{SubString{String}, String}) @ Pkg ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/Pkg.jl:49 invalidated:
backedges: 1: superseding print(xs...) @ Base coreio.jl:3 with MethodInstance for print(::Any, ::String) (3 children)
2: superseding print(io::IO, s::Union{SubString{String}, String}) @ Base strings/io.jl:250 with MethodInstance for print(::IO, ::String) (379 children)
1 mt_cache
There are three sources of method invalidations. The first two pertain to the Pkg.BinaryPlatforms module. The last source involves the UnstableIO type used internally by Pkg.
Pkg.BinaryPlatforms Invalidations
Pkg.BinaryPlatforms is the remnant of an older BinaryPlatforms API that is now Base.BinaryPlatforms. Pkg.BinaryPlatforms exists mainly for compatibility with packages that used the old API. Downstream of these APIs are JLLs via JLLWrappers.jl and the standard library Artifacts.jl.
Pkg.BinaryPlatforms has five platform types:
- UnknownPlatform
- Linux
- Windows
- MacOS
- FreeBSD
For comparison, Base.BinaryPlatforms has a single platform type:
- Platform
With a single Base.BinaryPlatforms.Platform type, less compilation is needed. Methods do not need to be specialized for the operating system specific types. However, a single Platform type means that alternate approaches to dispatching based on operating system are required.
All six platform types are subtypes of Base.BinaryPlatforms.AbstractPlatform. UnknownPlatform is a type with a unique implementation of Base.BinaryPlatforms.tags returning Dict{String,String}("os"=>"unknown"). The other four operating-system-specific types are just wrappers around a Base.BinaryPlatforms.Platform instance with the corresponding operating system.
In the output above, you will notice that the signature being invalidated involves == with AbstractPlatform arguments.
==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform)
We can explore the backedges using AbstractTrees.print_tree.
julia> print_tree(trees[1].backedges[1])
MethodInstance for ==(::AbstractPlatform, ::AbstractPlatform) at depth 0 with 4 children
└─ MethodInstance for isequal(::AbstractPlatform, ::AbstractPlatform) at depth 1 with 3 children
   └─ MethodInstance for Base.ht_keyindex2_shorthash!(::Dict{AbstractPlatform, Nothing}, ::AbstractPlatform) at depth 2 with 2 children
      ├─ MethodInstance for Base.ht_keyindex2_shorthash!(::Dict{AbstractPlatform, Nothing}, ::AbstractPlatform) at depth 3 with 0 children
      └─ MethodInstance for setindex!(::Dict{AbstractPlatform, Nothing}, ::Nothing, ::AbstractPlatform) at depth 3 with 0 children
Here we see the invalidated methods originate from the use of a Dict{AbstractPlatform, Nothing}.
I'm unsure of the exact origin of that type, but it likely comes from Artifacts.jl.
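One plausible source, offered only as a guess: a Set{T} is backed by a Dict{T, Nothing}, and the tree for the second set of invalidations below contains a Set{AbstractPlatform} produced by filtering the keys of a dictionary. The sketch below checks both facts. It relies on the internal `dict` field of `Set`, which is an implementation detail rather than public API, and the empty dictionary `d` is a made-up stand-in for whatever Artifacts.jl actually builds.

```julia
using Base.BinaryPlatforms: AbstractPlatform

# Internal detail (not public API): a Set{T} stores its elements as the keys
# of a Dict{T, Nothing}.
backing = fieldtype(Set{AbstractPlatform}, :dict)

# Hypothetical stand-in for a platform-keyed dictionary of artifact metadata.
d = Dict{AbstractPlatform, Dict{String, Any}}()

# Filtering a KeySet collects the surviving keys into a Set with the same
# element type, here Set{AbstractPlatform}.
filtered = filter(p -> true, keys(d))
```

If that guess is right, the Dict{AbstractPlatform, Nothing} in the first tree and the Set{AbstractPlatform} in the second tree are the same object seen at two levels.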
The second set of invalidations originates from defining tags(::Pkg.BinaryPlatforms.UnknownPlatform). The affected methods point to their use in Artifacts.jl.
julia> print_tree(trees[2].mt_backedges[end-1].second; maxdepth=100)
MethodInstance for Base.BinaryPlatforms.platforms_match(::AbstractPlatform, ::Platform) at depth 0 with 17 children
└─ MethodInstance for (::var"#47#49"{Platform})(::AbstractPlatform) at depth 1 with 16 children
   └─ MethodInstance for Base.mapfilter(::var"#47#49"{Platform}, ::typeof(push!), ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}, ::Set{AbstractPlatform}) at depth 2 with 15 children
      ├─ MethodInstance for filter(::var"#47#49"{Platform}, ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}) at depth 3 with 13 children
      │  └─ MethodInstance for Base.BinaryPlatforms.select_platform(::Dict{AbstractPlatform, Dict{String, Any}}, ::Platform) at depth 4 with 12 children
      │     ├─ MethodInstance for Artifacts.var"#artifact_meta#11"(::Platform, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 5 with 10 children
      │     │  ├─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 6 with 8 children
      │     │  │  ├─ MethodInstance for Artifacts.var"#artifact_meta#10"(::Platform, ::Nothing, ::typeof(artifact_meta), ::String, ::String) at depth 7 with 4 children
      │     │  │  │  └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::String) at depth 8 with 3 children
      │     │  │  │     └─ MethodInstance for Artifacts.var"#artifact_hash#12"(::Platform, ::Nothing, ::typeof(artifact_hash), ::String, ::String) at depth 9 with 2 children
      │     │  │  │        ├─ MethodInstance for Artifacts.artifact_hash(::String, ::String) at depth 10 with 0 children
      │     │  │  │        └─ MethodInstance for Artifacts.artifact_hash(::String, ::String) at depth 10 with 0 children
      │     │  │  └─ MethodInstance for Artifacts.var"#artifact_meta#18"(::Pairs{Symbol, Platform, Tuple{Symbol}, @NamedTuple{platform::Platform}}, ::typeof(artifact_meta), ::SubString{String}, ::Dict{String, Any}, ::String) at depth 7 with 2 children
      │     │  │     └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::SubString{String}, ::Dict{String, Any}, ::String) at depth 8 with 1 children
      │     │  │        └─ MethodInstance for Artifacts.artifact_slash_lookup(::String, ::Dict{String, Any}, ::String, ::Platform) at depth 9 with 0 children
      │     │  └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 6 with 0 children
      │     └─ MethodInstance for Artifacts.var"#artifact_meta#11"(::Platform, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 5 with 0 children
      └─ MethodInstance for filter(::var"#47#49"{Platform}, ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}) at depth 3 with 0 children
Part of the issue here is due to the use of AbstractPlatform, an abstract type, as the element type of a Vector or as the key type for a Dict. When the new AbstractPlatform subtypes are introduced in Pkg.BinaryPlatforms along with new specialized methods, prior methods involving AbstractPlatform are then invalidated.
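To make the container point concrete, here is a small sketch using made-up stand-in types (`AbstractPlat` and `Plat` are hypothetical, not the real Base.BinaryPlatforms types). It only shows how to check whether a container's key type is abstract, which is the property that leaves compiled code open to later invalidation:

```julia
# Hypothetical stand-ins for AbstractPlatform and Platform.
abstract type AbstractPlat end
struct Plat <: AbstractPlat
    os::String
end

# Keyed on the abstract type: code compiled for this Dict cannot assume
# which subtype a key has, so later subtypes and methods can invalidate it.
abstract_keyed = Dict{AbstractPlat, Int}(Plat("linux") => 1)

# Keyed on the concrete type: the key type is fully known to inference.
concrete_keyed = Dict{Plat, Int}(Plat("linux") => 1)

abstract_is_concrete = isconcretetype(keytype(abstract_keyed))
concrete_is_concrete = isconcretetype(keytype(concrete_keyed))
```

The first check yields false and the second true. A concretely keyed container is not always an option, but where it is, methods compiled for it are not affected by new subtypes of the abstract type.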
Potential Solutions
- Move Pkg.BinaryPlatforms to Base within the system image. If the module exists within Base, then we do not need to worry about the invalidations since they will be resolved when the system image is created.
  Fix Pkg.BinaryPlatforms invalidations by moving module to Base by mkitti · Pull Request #52249 · JuliaLang/julia · GitHub
- Eliminate the platform types created in Pkg.BinaryPlatforms. Instead, turn the names of the eliminated types into constructors for Base.BinaryPlatforms.Platform.
  Refactor Pkg.BinaryPlatforms compat, fix invalidations by mkitti · Pull Request #3736 · JuliaLang/Pkg.jl · GitHub
- Do not subtype Base.BinaryPlatforms.AbstractPlatform.
  i. Rather, make UnknownPlatform a wrapper box around a Base.BinaryPlatforms.Platform.
  ii. Subtype a new Pkg.BinaryPlatforms.AbstractPlatform.
  iii. Implement all methods by forwarding to the boxed Base.BinaryPlatforms.Platform within.
  iv. Define conversion from Pkg.BinaryPlatforms.AbstractPlatform to Base.BinaryPlatforms.Platform.
  Refactor Pkg.BinaryPlatforms to avoid invalidations, keep types by mkitti · Pull Request #3742 · JuliaLang/Pkg.jl · GitHub
The first approach is relatively simple, but would bloat the system image.
The second approach does not work because a few people actually use the operating system types in Pkg.BinaryPlatforms for dispatch.
The third approach is still being considered. Once it is implemented, the only remaining subtype of Base.BinaryPlatforms.AbstractPlatform will be Base.BinaryPlatforms.Platform. Within Base and Artifacts, most methods could then be redefined on Platform rather than AbstractPlatform. Once that is done, it may be possible to re-unify the AbstractPlatforms without causing invalidations.
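The shape of the third approach can be sketched with toy types. Everything below is a simplified, hypothetical stand-in: `BaseAbstractPlatform` and `BasePlatform` play the role of the Base.BinaryPlatforms types, `CompatAbstractPlatform` plays the role of the proposed Pkg.BinaryPlatforms.AbstractPlatform, and none of it is the code from the pull request.

```julia
# Stand-ins for Base.BinaryPlatforms (hypothetical, heavily simplified).
abstract type BaseAbstractPlatform end
struct BasePlatform <: BaseAbstractPlatform
    tags::Dict{String, String}
end
tags(p::BasePlatform) = p.tags
os(p::BaseAbstractPlatform) = tags(p)["os"]

# Stand-ins for the compat layer: a separate hierarchy that does NOT subtype
# BaseAbstractPlatform, so it adds no methods on the Base abstract type.
abstract type CompatAbstractPlatform end
struct Linux <: CompatAbstractPlatform
    p::BasePlatform
end
struct UnknownPlatform <: CompatAbstractPlatform
    p::BasePlatform
end
UnknownPlatform() = UnknownPlatform(BasePlatform(Dict("os" => "unknown")))

# Forward the API to the boxed platform.
tags(c::CompatAbstractPlatform) = tags(c.p)
os(c::CompatAbstractPlatform) = os(c.p)

# Conversion lets compat values flow into containers of the Base type.
Base.convert(::Type{BasePlatform}, c::CompatAbstractPlatform) = c.p

linux = Linux(BasePlatform(Dict("os" => "linux")))
platforms = BasePlatform[linux, UnknownPlatform()]   # elements are converted
```

The operating system types remain available for dispatch in downstream code, while containers and methods on the Base side only ever see the single concrete platform type.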
Pkg.UnstableIO Invalidations
Pkg.UnstableIO is a subtype of IO that is used to prevent specialization for specific IO subtypes. It attempts to accomplish this by having a field with an abstract type, IO. The idea is that methods would specialize on UnstableIO but not be able to infer the specific IO subtype contained within. Certain methods such as write, get, and print are directly forwarded to the method within.
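A rough sketch of that pattern follows. The type name `UnstableIOSketch` is hypothetical and this is not the actual Pkg source. It only illustrates an IO subtype with an abstractly typed field that forwards write, get, and print to the wrapped stream.

```julia
# Hypothetical sketch of the pattern, not the Pkg.jl implementation.
struct UnstableIOSketch <: IO
    io::IO   # abstract field type: the concrete IO subtype is hidden from inference
end

# Forward a few methods to the wrapped stream.
Base.write(io::UnstableIOSketch, b::UInt8) = write(io.io, b)
Base.get(io::UnstableIOSketch, key, default) = get(io.io, key, default)
Base.print(io::UnstableIOSketch, s::Union{SubString{String}, String}) = print(io.io, s)

buf = IOBuffer()
wrapped = UnstableIOSketch(buf)
print(wrapped, "hello")
written = String(take!(buf))

field_is_abstract = !isconcretetype(fieldtype(UnstableIOSketch, :io))
```

Note that the `print` method in this sketch has the same shape as the one reported in the invalidation tree above, so defining it in a fresh session would be expected to trigger similar invalidations.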
Introducing a new method for print on a new IO subtype invalidates many methods.
julia> print_tree(trees[3].backedges)
InstanceNode[MethodInstance for print(::Any, ::String) at depth 0 with 3 children, MethodInstance for print(::IO, ::String) at depth 0 with 379 children]
...
├─ MethodInstance for Base.var"#with_output_color#1077"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(with_output_color), ::typeof(print), ::Symbol, ::IO, ::Any, ::String) at depth 1 with 7 children
│  └─ MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool}, ::typeof(with_output_color), ::typeof(print), ::Symbol, ::IO, ::Any, ::String) at depth 2 with 6 children
│     └─ MethodInstance for Base.var"#printstyled#1078"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Symbol, ::typeof(printstyled), ::IO, ::Any, ::String) at depth 3 with 5 children
│        └─ MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::Any, ::String) at depth 4 with 4 children
│           ⋮
│
└─ MethodInstance for REPL.LineEdit.add_history(::REPLHistoryProvider, ::PromptState) at depth 1 with 0 children
Potential Solutions
As with Pkg.BinaryPlatforms, I also proposed just moving UnstableIO to Base.
Jameson Nash noted that UnstableIO could just be IOContext{IO}. Therefore, I proposed eliminating the UnstableIO type and replacing it with IOContext{IO}. Keno also pointed out that compiler improvements could probably infer through UnstableIO, so an explicit inferencebarrier is now required.
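A minimal sketch of what such a replacement could look like is below. The function name `unstable_io_sketch` is hypothetical, and the two-argument `IOContext{IO}(io, dict)` call relies on an inner constructor of `IOContext` that is not documented public API, so treat this as an assumption about current Julia internals rather than the code from the pull request.

```julia
# Hypothetical helper: wrap any IO so that callers see only IOContext{IO}.
function unstable_io_sketch(@nospecialize(io::IO))
    # Hide the concrete type from inference explicitly.
    hidden = Base.inferencebarrier(io)
    # Assumption: IOContext's inner constructor accepts (io, ImmutableDict).
    return IOContext{IO}(hidden, Base.ImmutableDict{Symbol, Any}())
end

buf = IOBuffer()
ctx = unstable_io_sketch(buf)
print(ctx, "hello")            # IOContext forwards writes to the wrapped stream
written = String(take!(buf))
```

Because IOContext already lives in Base, no new IO subtype and no new `print` method are introduced, which is what removes the source of the invalidations.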
Summary
The root cause of these invalidations is the use of abstract types, such as AbstractPlatform and IO, in ways that cannot be inferred to concrete subtypes.
The invalidations caused by loading Pkg.jl in Julia 1.11 have not been solved yet. I have proposed pull requests to do so as mentioned above. The solutions could be as simple as moving code between packages or as complicated as rearranging type hierarchies.
The invalidated methods are in widespread use and are used by JLLs and users of print(io::IO, ...). Fixing these invalidations is necessary to reduce compilation latency in the Julia 1.11 ecosystem.