Case Study: Method Invalidations caused by Pkg.jl with Julia 1.11

Introduction

While using a Julia nightly build, which will become Julia 1.11, I noticed that loading Pkg.jl causes method invalidations. As of Julia 1.11, Pkg.jl is no longer part of the system image and is thus a normal package. As a result, loading Pkg.jl can now invalidate methods compiled into the system image.

Method invalidations occur when newly defined methods force previously compiled code to be recompiled. Typically this involves code that was compiled for abstract rather than concrete argument types, that is, type-unstable code. Recompilation adds to compilation latency.
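
As a minimal sketch of the mechanism (all type and function names here are hypothetical and unrelated to Pkg.jl), the following shows code compiled against an abstract element type, followed by a later method definition that can force it to be recompiled:

```julia
# Hypothetical types and functions, for illustration only.
abstract type Shape end
struct Circle <: Shape end

label(::Circle) = "circle"

# The element type is abstract, so the call to `label` cannot be resolved to a
# single concrete method from the argument types alone.
labels(shapes::Vector{Shape}) = [label(s) for s in shapes]

# First call: compiled while `label` has exactly one method, which the
# compiler may assume is the only possible target.
first_result = labels(Shape[Circle()])

# A new subtype and a new method arrive later, as if from another package.
struct Square <: Shape end
label(::Square) = "square"

# Compiled code that relied on the old method table is no longer valid and is
# recompiled before this call runs.
second_result = labels(Shape[Circle(), Square()])
```

Whether a given call is actually invalidated depends on what the compiler inferred, so a tool such as @snoopr is still needed to confirm it.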

Pkg.jl is a package that we all use. This creates a unique opportunity for a case study for how invalidations occur and how they may be resolved. To observe the invalidations yourself, download a nightly build and follow the steps below.

Eliminating the invalidations will reduce compilation latency not only for Pkg.jl but also for other packages that depend on the invalidated methods and would otherwise need to be recompiled.

Finding the Invalidations

You can find invalidations using SnoopCompileCore.jl and display them using SnoopCompile.jl. Specifically, the @snoopr macro returns a list of invalidations. See the SnoopCompile documentation for a detailed guide. Here’s how I applied SnoopCompile[Core] to Pkg.jl:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.0-DEV.1340 (2024-01-19)
 _/ |\__'_|_|_|\__'_|  |  Commit fb2d946b885 (0 days old master)
|__/                   |

julia> using SnoopCompileCore
       invalidations = @snoopr using Pkg
959-element Vector{Any}:

julia> using SnoopCompile
       trees = invalidation_trees(invalidations)
3-element Vector{SnoopCompile.MethodInvalidations}:
 inserting ==(a::Union{Pkg.BinaryPlatforms.FreeBSD, Pkg.BinaryPlatforms.Linux, Pkg.BinaryPlatforms.MacOS, Pkg.BinaryPlatforms.Windows}, b::Base.BinaryPlatforms.AbstractPlatform) @ Pkg.BinaryPlatforms ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/BinaryPlatforms_compat.jl:89 invalidated:
   backedges: 1: superseding ==(x, y) @ Base Base.jl:177 with MethodInstance for ==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform) (4 children)

 inserting tags(::Pkg.BinaryPlatforms.UnknownPlatform) @ Pkg.BinaryPlatforms ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/BinaryPlatforms_compat.jl:17 invalidated:
   mt_backedges:  1: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.triplet(::Base.BinaryPlatforms.AbstractPlatform) (0 children)
                  2: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.os_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
                  3: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libgfortran_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
                  4: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.cxxstring_abi(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
                  5: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libstdcxx_version(::Base.BinaryPlatforms.AbstractPlatform) (1 children)
                  6: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.arch(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
                  7: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.os(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
                  8: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.libc(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
                  9: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.call_abi(::Base.BinaryPlatforms.AbstractPlatform) (2 children)
                 10: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for Base.BinaryPlatforms.platforms_match(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.Platform) (17 children)
                 11: signature Tuple{typeof(Base.BinaryPlatforms.tags), Base.BinaryPlatforms.AbstractPlatform} triggered MethodInstance for (::Base.BinaryPlatforms.var"#match_loss#50")(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.Platform) (41 children)

 inserting print(io::Pkg.UnstableIO, arg::Union{SubString{String}, String}) @ Pkg ~/src/julia-fb2d946b88/share/julia/stdlib/v1.11/Pkg/src/Pkg.jl:49 invalidated:
   backedges: 1: superseding print(xs...) @ Base coreio.jl:3 with MethodInstance for print(::Any, ::String) (3 children)
              2: superseding print(io::IO, s::Union{SubString{String}, String}) @ Base strings/io.jl:250 with MethodInstance for print(::IO, ::String) (379 children)
   1 mt_cache

There are three sources of method invalidations. The first two pertain to the Pkg.BinaryPlatforms module. The last source involves the UnstableIO type used internally by Pkg.

Pkg.BinaryPlatforms Invalidations

Pkg.BinaryPlatforms is the remnant of an older BinaryPlatforms API that is now Base.BinaryPlatforms. Pkg.BinaryPlatforms exists mainly for compatibility with packages that used the old API. Downstream of these APIs are JLLs via JLLWrappers.jl and the standard library Artifacts.jl.

Pkg.BinaryPlatforms has five platform types:

  1. UnknownPlatform
  2. Linux
  3. Windows
  4. MacOS
  5. FreeBSD

For comparison, Base.BinaryPlatforms has a single platform type:

  1. Platform

With a single Base.BinaryPlatforms.Platform type, less compilation is needed. Methods do not need to be specialized for the operating system specific types. However, a single Platform type means that alternate approaches to dispatching based on operating system are required.

All six platform types are subtypes of Base.BinaryPlatforms.AbstractPlatform. UnknownPlatform is a type with a unique implementation of Base.BinaryPlatforms.tags returning Dict{String,String}("os"=>"unknown"). The other four operating system specific types are just wrappers around a Base.BinaryPlatforms.Platform instance with the corresponding operating system.

Above you will notice that the signature being invalidated involves == with AbstractPlatform arguments.

==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform)

We can explore the backedges using AbstractTrees.print_tree.

julia> print_tree(trees[1].backedges[1])
MethodInstance for ==(::AbstractPlatform, ::AbstractPlatform) at depth 0 with 4 children
└─ MethodInstance for isequal(::AbstractPlatform, ::AbstractPlatform) at depth 1 with 3 children
   └─ MethodInstance for Base.ht_keyindex2_shorthash!(::Dict{AbstractPlatform, Nothing}, ::AbstractPlatform) at depth 2 with 2 children
      ├─ MethodInstance for Base.ht_keyindex2_shorthash!(::Dict{AbstractPlatform, Nothing}, ::AbstractPlatform) at depth 3 with 0 children
      └─ MethodInstance for setindex!(::Dict{AbstractPlatform, Nothing}, ::Nothing, ::AbstractPlatform) at depth 3 with 0 children

Here we see the invalidated methods originate from the use of a Dict{AbstractPlatform, Nothing}.

I’m unsure of the exact origin of that type, but it likely comes from Artifacts.jl.

The second set of invalidations originates from defining tags(::Pkg.BinaryPlatforms.UnknownPlatform). The affected methods point to their use in Artifacts.jl.

julia> print_tree(trees[2].mt_backedges[end-1].second; maxdepth=100)
MethodInstance for Base.BinaryPlatforms.platforms_match(::AbstractPlatform, ::Platform) at depth 0 with 17 children
└─ MethodInstance for (::var"#47#49"{Platform})(::AbstractPlatform) at depth 1 with 16 children
   └─ MethodInstance for Base.mapfilter(::var"#47#49"{Platform}, ::typeof(push!), ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}, ::Set{AbstractPlatform}) at depth 2 with 15 children
      ├─ MethodInstance for filter(::var"#47#49"{Platform}, ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}) at depth 3 with 13 children
      │  └─ MethodInstance for Base.BinaryPlatforms.select_platform(::Dict{AbstractPlatform, Dict{String, Any}}, ::Platform) at depth 4 with 12 children
      │     ├─ MethodInstance for Artifacts.var"#artifact_meta#11"(::Platform, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 5 with 10 children
      │     │  ├─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 6 with 8 children
      │     │  │  ├─ MethodInstance for Artifacts.var"#artifact_meta#10"(::Platform, ::Nothing, ::typeof(artifact_meta), ::String, ::String) at depth 7 with 4 children
      │     │  │  │  └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::String) at depth 8 with 3 children
      │     │  │  │     └─ MethodInstance for Artifacts.var"#artifact_hash#12"(::Platform, ::Nothing, ::typeof(artifact_hash), ::String, ::String) at depth 9 with 2 children
      │     │  │  │        ├─ MethodInstance for Artifacts.artifact_hash(::String, ::String) at depth 10 with 0 children
      │     │  │  │        └─ MethodInstance for Artifacts.artifact_hash(::String, ::String) at depth 10 with 0 children
      │     │  │  └─ MethodInstance for Artifacts.var"#artifact_meta#18"(::Pairs{Symbol, Platform, Tuple{Symbol}, @NamedTuple{platform::Platform}}, ::typeof(artifact_meta), ::SubString{String}, ::Dict{String, Any}, ::String) at depth 7 with 2 children
      │     │  │     └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::SubString{String}, ::Dict{String, Any}, ::String) at depth 8 with 1 children
      │     │  │        └─ MethodInstance for Artifacts.artifact_slash_lookup(::String, ::Dict{String, Any}, ::String, ::Platform) at depth 9 with 0 children
      │     │  └─ MethodInstance for Core.kwcall(::@NamedTuple{platform::Platform}, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 6 with 0 children
      │     └─ MethodInstance for Artifacts.var"#artifact_meta#11"(::Platform, ::typeof(artifact_meta), ::String, ::Dict{String, Any}, ::String) at depth 5 with 0 children
      └─ MethodInstance for filter(::var"#47#49"{Platform}, ::KeySet{AbstractPlatform, Dict{AbstractPlatform, Dict{String, Any}}}) at depth 3 with 0 children

Part of the issue here is due to the use of AbstractPlatform, an abstract type, as the element type of a Vector or as the key type for a Dict. When the new AbstractPlatform subtypes are introduced in Pkg.BinaryPlatforms along with new specialized methods, prior methods involving AbstractPlatform are then invalidated.
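
To make the container point concrete, here is a small sketch using the Base.BinaryPlatforms types. The variable names are mine, and the snippet only contrasts the key types; it does not measure invalidations:

```julia
using Base.BinaryPlatforms: AbstractPlatform, Platform

abstract_keys = Dict{AbstractPlatform, Nothing}()
concrete_keys = Dict{Platform, Nothing}()

abstract_keys[Platform("x86_64", "linux")] = nothing
concrete_keys[Platform("x86_64", "linux")] = nothing

# With abstract keys, the compiler only knows that each key is *some*
# AbstractPlatform, so calls such as `==` or `tags` on a key are resolved
# against every method that could apply to any subtype.
key_is_abstract = !isconcretetype(keytype(abstract_keys))

# With concrete keys, the same calls are resolved for `Platform` alone.
key_is_concrete = isconcretetype(keytype(concrete_keys))
```

Code compiled for the second dictionary has no reason to care about new AbstractPlatform subtypes, whereas code compiled for the first one does.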

Potential Solutions

  1. Move Pkg.BinaryPlatforms to Base within the system image.
    If the module exists within Base, then we do not need to worry about the invalidations since they will be resolved when the system image is created.
    Fix Pkg.BinaryPlatforms invalidations by moving module to Base by mkitti · Pull Request #52249 · JuliaLang/julia · GitHub
  2. Eliminate the platform types created in Pkg.BinaryPlatforms. Instead turn the names of the eliminated types into constructors for Base.BinaryPlatforms.Platform.
    Refactor Pkg.BinaryPlatforms compat, fix invalidations by mkitti · Pull Request #3736 · JuliaLang/Pkg.jl · GitHub
  3. Do not subtype Base.BinaryPlatforms.AbstractPlatform.
    i. Rather make UnknownPlatform a wrapper box around a Base.BinaryPlatforms.Platform.
    ii. Subtype a new Pkg.BinaryPlatforms.AbstractPlatform
    iii. Implement all methods by forwarding to the boxed Base.BinaryPlatforms.Platform within.
    iv. Define conversion from Pkg.BinaryPlatforms.AbstractPlatform to Base.BinaryPlatforms.Platform.
    Refactor Pkg.BinaryPlatforms to avoid invalidations, keep types by mkitti · Pull Request #3742 · JuliaLang/Pkg.jl · GitHub

The first approach is relatively simple, but would bloat the system image.

The second approach does not work because a few people actually use the operating system types in Pkg.BinaryPlatforms for dispatch.

The third approach is still being considered. Once it is implemented, the only remaining subtype of Base.BinaryPlatforms.AbstractPlatform would be Base.BinaryPlatforms.Platform. Within Base and Artifacts, most methods could then be redefined on Platform rather than AbstractPlatform. Once that is done, it may be possible to re-unify the AbstractPlatforms without causing invalidations.
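
The third approach can be sketched roughly as follows. This is not the code from the pull request: the module name CompatSketch, the field name p, and the string-based constructors are all hypothetical, and only three of the forwarded functions are shown.

```julia
module CompatSketch  # hypothetical module, not the code in Pkg.jl

using Base.BinaryPlatforms: Platform
import Base.BinaryPlatforms: tags, arch, os

# A separate abstract type, deliberately NOT a subtype of
# Base.BinaryPlatforms.AbstractPlatform.
abstract type AbstractPlatform end

# Each operating-system type is a box around a Base Platform.
struct Linux <: AbstractPlatform
    p::Platform
end
Linux(cpu::String) = Linux(Platform(cpu, "linux"))

struct MacOS <: AbstractPlatform
    p::Platform
end
MacOS(cpu::String) = MacOS(Platform(cpu, "macos"))

# Forward queries to the boxed Platform.
tags(c::AbstractPlatform) = tags(c.p)
arch(c::AbstractPlatform) = arch(c.p)
os(c::AbstractPlatform) = os(c.p)

# Conversion back to the Base type for APIs that expect a Platform.
Base.convert(::Type{Platform}, c::AbstractPlatform) = c.p

end # module

linux = CompatSketch.Linux("x86_64")
base_platform = convert(Base.BinaryPlatforms.Platform, linux)
```

Because the new types sit outside the Base.BinaryPlatforms.AbstractPlatform hierarchy, the forwarding methods should not intersect with signatures already compiled for that abstract type, while the operating-system types remain available for dispatch.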

Pkg.UnstableIO Invalidations

Pkg.UnstableIO is a subtype of IO that is used to prevent specialization for specific IO subtypes. It attempts to accomplish this by having a field with an abstract type, IO. The idea is that methods would specialize on UnstableIO but not be able to infer the specific IO subtype contained within. Certain methods such as write, get, and print are directly forwarded to the method within.
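
A rough sketch of such a wrapper is shown below. OpaqueIO is a hypothetical stand-in, not the definition in Pkg.jl, and it forwards only the byte-level write and get. Pkg's UnstableIO also forwards print directly, and that print method is the one reported in the invalidation tree above.

```julia
# Hypothetical stand-in for Pkg.UnstableIO, for illustration only.
struct OpaqueIO <: IO
    io::IO   # abstract field type: the concrete stream type is hidden from inference
end

# Forward the byte-level write; Base's generic `print(::IO, ::String)` falls
# back to it, so no new `print` method is needed in this sketch.
Base.write(o::OpaqueIO, byte::UInt8) = write(o.io, byte)

# Forward property lookups to the wrapped stream as well.
Base.get(o::OpaqueIO, key, default) = get(o.io, key, default)

buffer = IOBuffer()
wrapped = OpaqueIO(buffer)
print(wrapped, "hello")
written = String(take!(buffer))
```

Methods taking an OpaqueIO are compiled once for the wrapper, and every access to the inner stream goes through runtime dispatch.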

Introducing a new method for print on a new IO subtype invalidates many methods.

julia> print_tree(trees[3].backedges)
InstanceNode[MethodInstance for print(::Any, ::String) at depth 0 with 3 children, MethodInstance for print(::IO, ::String) at depth 0 with 379 children]
...
   ├─ MethodInstance for Base.var"#with_output_color#1077"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(with_output_color), ::typeof(print), ::Symbol, ::IO, ::Any, ::String) at depth 1 with 7 children
   │  └─ MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool}, ::typeof(with_output_color), ::typeof(print), ::Symbol, ::IO, ::Any, ::String) at depth 2 with 6 children
   │     └─ MethodInstance for Base.var"#printstyled#1078"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Symbol, ::typeof(printstyled), ::IO, ::Any, ::String) at depth 3 with 5 children
   │        └─ MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::Any, ::String) at depth 4 with 4 children
   │           ⋮
   │
   └─ MethodInstance for REPL.LineEdit.add_history(::REPLHistoryProvider, ::PromptState) at depth 1 with 0 children

Potential Solutions

As with Pkg.BinaryPlatforms, I also proposed just moving UnstableIO to Base.

Jameson Nash noted that UnstableIO could just be IOContext{IO}. Therefore, I proposed eliminating the UnstableIO type and replacing it with IOContext{IO}. Keno also pointed out that compiler improvements could probably infer through UnstableIO, so an explicit inferencebarrier is now required.
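
A sketch of that replacement, under stated assumptions: hide_io_type is a hypothetical helper name, and the two-argument IOContext{IO}(io, dict) form relies on an internal constructor of Base that may change between Julia versions.

```julia
# Hypothetical helper; relies on Base internals (the inner IOContext
# constructor and Base.ImmutableDict), so treat it as a sketch only.
function hide_io_type(@nospecialize(io::IO))
    # Keep inference from propagating the concrete type of `io`.
    hidden = Base.inferencebarrier(io)
    # IOContext{IO} stores the stream in a field typed as the abstract IO.
    return IOContext{IO}(hidden, Base.ImmutableDict{Symbol, Any}())
end

buffer = IOBuffer()
ctx = hide_io_type(buffer)
print(ctx, "hello")
written = String(take!(buffer))
```

Since IOContext already lives in Base, no new IO subtype and no new print method are introduced, which is what avoids the invalidations described above.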

Summary

The root cause of these invalidations is the use of abstract types such as AbstractPlatform and IO in a way that cannot be inferred to concrete subtypes.

The invalidations caused by loading Pkg.jl in Julia 1.11 have not been solved yet. I have proposed pull requests to do so as mentioned above. The solutions could be as simple as moving code between packages or as complicated as rearranging type hierarchies.

The invalidated methods are in widespread use and are used by JLLs and users of print(io::IO, ...). Fixing these invalidations is necessary to reduce compilation latency in the Julia 1.11 ecosystem.

One could also consider a deeper root cause to be the combination of:

  1. People don’t write type-stable code consistently. If the authors of Pkg - presumably seasoned Julia core devs - don’t create type-stable code, it’s hard to have hope that the broader ecosystem will be type-stable
  2. Type unstable code is prone to invalidations from third-party code. This is due to compiler optimisations that try to regain performance from some type-unstable code

I fear the only reasonable solution to this problem is to remove the compiler optimisations mentioned in 2. If we don’t, we’re going to play invalidation whack-a-mole until the end of time, while also living with the fact that invalidations will continue to be rampant in the wider ecosystem - see e.g. this post where a single package causes 66,000 invalidations.

Also, having the optimisations in 2. defeats the idea that Julia devs can strategically choose to write type-unstable code where performance is not important. This is not actually a possibility if type instability across any larger piece of code causes uncontrollable invalidation and latency, even for code that is not performance sensitive.

Is there an option to opt out of these optimizations? Like saying on a module level, I do not want these optimizations in any of my code? Then we could at least bit by bit opt the ecosystem out of the invalidation-causing behavior, and when enough people have applied the opt out, we could think about disabling them ecosystem wide and it wouldn’t have as much of a performance impact anymore.

Also it seems dangerous to rely on union splitting for abstract types just because there happen to be only 2 or 3 subtypes. If you were sure only those two or three could ever work with your code, you could annotate that as well and rely on union splitting that can’t be invalidated.
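
One way to read that suggestion, as a sketch with hypothetical types: name the supported subtypes in an explicit Union, so that each call is resolved against concrete types instead of against whatever subtypes of the abstract type happen to be loaded.

```julia
# Hypothetical types, for illustration only.
abstract type Animal end
struct Cat <: Animal end
struct Dog <: Animal end

sound(::Cat) = "meow"
sound(::Dog) = "woof"

# The annotation: only these two types are supported here.
const KnownAnimal = Union{Cat, Dog}

# The element type is a small Union of concrete types, so the compiler can
# split the call to `sound` into one branch per member.
sounds(animals::Vector{KnownAnimal}) = [sound(a) for a in animals]

result = sounds(KnownAnimal[Cat(), Dog()])
```

A later subtype of Animal with its own sound method does not match either branch, so it should not affect the compiled code for sounds.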

Am I missing something, or did you show a use of the Base.BinaryPlatforms API, and not of the Pkg.BinaryPlatforms that you’re trying to get rid of? I don’t remember JLLs using the old platform types; that was back in the BinaryProvider days. JLLs should only use Platform, I think.

Pkg.BinaryPlatforms invalidates Base.BinaryPlatforms methods with AbstractPlatform, which the JLLs do use. That’s part of the tragedy. An old compatibility module is invalidating methods used by the new API.

Before Julia 1.6, JLLWrappers.jl did use the old API.

The two contemporary users of Pkg.BinaryPlatforms are @ararslan and AppBundler.jl.

I think we should formally deprecate Pkg.BinaryPlatforms and eventually move it out of Pkg as an independent compat package.

I haven’t taken a look myself, but one thing appears to be missing from this excellent analysis: if you use ascend to dig into the invalidated items, are there opportunities for fixing things? Often one can fix invalidations by improving inference in the “victim.” But there are cases involving deliberately-abstract inference where this is not possible.

IIUC this is possible for methods you define using Base.Experimental.@max_methods, but that wouldn’t help for methods defined elsewhere?

help?> Base.Experimental.@max_methods
  Experimental.@max_methods n::Int

  Set the maximum number of potentially-matching methods considered when running inference for methods defined in the current module. This setting affects inference of calls with
  incomplete knowledge of the argument types.

  The benefit of this setting is to avoid excessive compilation and reduce invalidation risks in poorly-inferred cases. For example, when @max_methods 2 is set and there are two
  potentially-matching methods returning different types inside a function body, then Julia will compile subsequent calls for both types so that the compiled function body accounts
  for both possibilities. Also the compiled code is vulnerable to invalidations that would happen when either of the two methods gets invalidated. This speculative compilation and
  these invalidations can be avoided by setting @max_methods 1 and allowing the compiled code to resort to runtime dispatch instead.

  Supported values are 1, 2, 3, 4, and default (currently equivalent to 3).

  ────────────────────────────────────────────────────────────────────────────────

  Experimental.@max_methods n::Int function fname end

  Set the maximum number of potentially-matching methods considered when running inference for the generic function fname. Overrides any module-level or global inference settings for
  max_methods. This setting is global for the entire generic function (or more precisely the MethodTable).
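
As a minimal sketch of the module-level form described in this help text (the module and its contents are hypothetical), the setting goes at the top of the module:

```julia
module LowLatencySketch  # hypothetical module

# Consider at most one potentially-matching method when inferring
# poorly-typed calls made from code defined in this module.
Base.Experimental.@max_methods 1

abstract type Item end
struct Book <: Item end
struct Film <: Item end

title(::Book) = "book"
title(::Film) = "film"

# Two methods of `title` match an `Item` argument. That exceeds the limit of
# one, so this call is left to runtime dispatch instead of being compiled
# against both methods.
titles(items::Vector{Item}) = [title(i) for i in items]

end # module

items = LowLatencySketch.Item[LowLatencySketch.Book(), LowLatencySketch.Film()]
result = LowLatencySketch.titles(items)
```

The results are the same either way; the trade-off is between slower dynamic calls inside titles and compiled code that no longer depends on the current set of title methods.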

While you can fix things case by case, I do tend to think that the “world-splitting” optimization is generally not a good default, and from what I have seen, it gives minimal runtime benefits compared to the invalidations (and thus compile times) it causes. Julia has changed a lot since it was added, so I think we should really do a thorough analysis of whether it makes sense for normal code in standard 2024 Julia v1.10 environments.

I like the idea of reconsidering world-splitting, but one major issue is that the compiler, JuliaInterpreter, and other code that has no choice but to handle a lot of uncontrolled types heavily leverages the “single method optimization”: rather than having many methods and dispatching on them, just do manual dispatch via if isa(x, GlobalRef) ... elseif isa(x, Symbol) ... end. This is a really important performance optimization and we’d need a substitute.
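
A small sketch of that manual-dispatch pattern, using a hypothetical describe function with a single method:

```julia
# One method that accepts anything; `@nospecialize` asks the compiler not to
# compile a separate specialization per argument type.
function describe(@nospecialize(x))
    if isa(x, GlobalRef)
        return "global reference to $(x.name)"
    elseif isa(x, Symbol)
        return "symbol $(x)"
    else
        return "something else"
    end
end

from_ref = describe(GlobalRef(Base, :sin))
from_symbol = describe(:sin)
from_other = describe(1.0)
```

Callers with poorly inferred arguments can still call describe cheaply because there is only one method to pick, which is the situation the single method optimization targets.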

My assumption was that a lot of this was written to be type unstable on purpose, because Pkg used to be in the sysimage, so one wouldn’t necessarily want all of its precompiled methods to end up bloating the sysimage, imposing a cost on non-Pkg workflows.

I think the solution there is unimethod functions: Unimethod Functions · Issue #23095 · JuliaLang/julia · GitHub. If there are some use cases which want to optimize runtime performance with type-instability and tend to only have a lot of unimethod functions, then that kind of use case should have the ability to opt functions into a declaration that they should be optimized as non-dispatching and thus not perform multiple dispatch. Maybe just a macro on the function declaration. But if that’s the case, then I would like to see an error message thrown if you try to define a second method.

I think where this goes wrong is that some use cases need fast single dispatch functions on type unstable code, so the compiler optimizes all single dispatch functions on type unstable code. That compiler optimization is then relatively easily mis-applied though. I think a nice example of that is (::Any == ::Any)::Bool, which is perfectly fine in Base as the only return type from == in Base, but then any symbolic representation of code turns == into something symbolic and every piece of code with a type-unstable == recompiles due to Symbolics existing. The problem is, if == “should” only ever return a Boolean (which I don’t think should be the case, there are other counterexamples), then we should get an error for violating the rule. This compiler optimization creates “unwritten rules” which, if broken, give you 2 minute compile and load times, so technically you can make the dispatch but in practice you know that in any widely used code you cannot do that. An example of this is Base.show(::Type{T}, x) for any concrete T (IIRC), which is a major invalidator, so in theory you can make this dispatch but in practice you cannot because of the compiler heuristics and invalidation.

Those cases are not single dispatch cases, but what I mean by all of that is that if we need some cases to apply more compiler optimizations on type unstable code, then I think we need to give users the ability to specify those functions or that module as “optimize this more”. Currently we do this the other way around and default to performing all of these optimizations given what is seen in Base, but I don’t think assumptions built on the Base library are a good idea. “There is only one dispatch of this function in Base so therefore there will only be one dispatch of this function” or “All dispatches of this function in Base have the same return type so therefore assume all dispatches of this function will have the same return type” is not something I think you can extrapolate well from Base. Base is just too small of a sample of what Julia code is like to understand these behaviors. So I would much prefer we optimize like that less by default but let people opt certain functions or modules into such assumptions.

An aside, I think it would be interesting if during the system image build we could for example load up all of JuMP, SciML, etc. and then do the optimizations based on what we know about the dispatches that exist in the wild. I don’t think this is practical, but I think this is the kind of thing you’d actually have to do in order to know if a compiler optimization of this sort would cause invalidations.

Also an aside, I’ve thought about adding “fake” dispatches to Base as a way to de-optimize a function. For example with the == example, we could define 4 singleton types and then do (::T == ::T)::T on those 4, which then gives Base enough methods that it won’t apply the optimization and then we won’t get invalidations downstream. We could in theory do that on all of the major invalidators. I don’t know if people would think that’s too much of a hack though.
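
A sketch of what such placeholder methods might look like, using a hypothetical function same instead of == so that nothing in Base is touched:

```julia
# The one "real" method, which returns a Bool.
same(x, y) = x === y

# Four placeholder singleton types, each with a method that returns a non-Bool.
struct Fake1 end
struct Fake2 end
struct Fake3 end
struct Fake4 end

same(a::Fake1, ::Fake1) = a
same(a::Fake2, ::Fake2) = a
same(a::Fake3, ::Fake3) = a
same(a::Fake4, ::Fake4) = a

# A call with unknown argument types now has five candidate methods, more
# than the default limit of 3 quoted in the help text above.
method_count = length(methods(same))
```

With that many candidates, inference of a poorly-typed call to same would be expected to give up and fall back to runtime dispatch, so a sixth method added downstream has no compiled assumptions to break.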

We’ve done that in very rare cases, e.g., make convert(Union{},x) directly ambiguous by vtjnash · Pull Request #46000 · JuliaLang/julia · GitHub

I tried using Cthulhu.jl but I did not really see anything very insightful there. I’m guessing it’s because the invalidated methods are from precompilation?

julia> using Cthulhu

julia> trees[1].backedges[1].mi
MethodInstance for ==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform)

julia> ascend(trees[1].backedges[1].mi)
Choose a call for analysis (q to quit):
 >   ==(::Base.BinaryPlatforms.AbstractPlatform, ::Base.BinaryPlatforms.AbstractPlatform)
==(x, y) @ Base Base.jl:177
177 ==(x::Base.BinaryPlatforms.AbstractPlatform, y::Base.BinaryPlatforms.AbstractPlatform)::Bool = x::Base.BinaryPlatforms.AbstractPlatform === y::Base.BinaryPlatforms.AbstractPlatform
Select a call to descend into or ↩ to ascend. [q]uit. [b]ookmark.
Toggles: [w]arn, [h]ide type-stable statements, [t]ype annotations, [s]yntax highlight for Source/LLVM/Native, [j]ump to source always.
Show: [S]ource code, [A]ST, [T]yped code, [L]LVM IR, [N]ative code
Actions: [E]dit source code, [R]evise and redisplay
 • ↩

I also ran into errors such as the following.

julia> trees[2].mt_backedges[1].second.mi |> ascend
ERROR: MethodError: no method matching method(::Type{Tuple{typeof(Core.kwcall), NamedTuple, typeof(Base.Sort._sort!), AbstractVector, Base.Sort.ScratchQuickSort, Base.Order.Ordering, Any}})

Closest candidates are:
  method(::Vector{Base.StackTraces.StackFrame})
   @ Cthulhu ~/.julia/packages/Cthulhu/O1Xhq/src/backedges.jl:99
  method(::Core.MethodInstance)
   @ Cthulhu ~/.julia/packages/Cthulhu/O1Xhq/src/backedges.jl:93

julia> trees[2].mt_backedges[1].second.mi.backedges[2] |> ascend
Choose a call for analysis (q to quit):
     #collect_artifacts#41(::Base.BinaryPlatforms.AbstractPlatform, ::typeof(Pkg.Operations.collect_artifacts), ::String)
       kwcall(::NamedTuple{(:platform,), <:Tuple{Base.BinaryPlatforms.AbstractPlatform}}, ::typeof(Pkg.Operations.collect_artifacts), ::String)
         #download_artifacts#42(::Base.BinaryPlatforms.AbstractPlatform, ::Nothing, ::Bool, ::IO, ::typeof(Pkg.Operations.download_artifacts), ::Pkg.Types.EnvCache)
           kwcall(::NamedTuple{(:platform, :julia_version, :io), <:Tuple{Base.BinaryPlatforms.AbstractPlatform, Union{Nothing, VersionNumber}, IO}}, ::typeof(Pkg.Operations.download_artifacts), ::Pkg.Type
 >           #add#86(::Pkg.Types.PreserveLevel, ::Base.BinaryPlatforms.AbstractPlatform, ::Symbol, ::typeof(Pkg.Operations.add), ::Pkg.Types.Context, ::Vector{Pkg.Types.PackageSpec}, ::Set{Base.UUID})
             #develop#95(::Pkg.Types.PreserveLevel, ::Base.BinaryPlatforms.AbstractPlatform, ::typeof(Pkg.Operations.develop), ::Pkg.Types.Context, ::Vector{Pkg.Types.PackageSpec}, ::Set{Base.UUID})

What’s the current status of type inference and keyword arguments? There are quite a few methods which have a keyword argument platform::AbstractPlatform = HostPlatform().

For example, Artifacts.artifact_meta or Pkg.Operations.add have platform keywords.

An important question here is what of this API if any is considered public API?