Case Study: Method Invalidations caused by Pkg.jl with Julia 1.11

I think the solution there is unimethod functions (see “Unimethod Functions”, Issue #23095 in JuliaLang/julia on GitHub). If some use cases want to optimize runtime performance in type-unstable code and tend to consist mostly of unimethod functions, then those use cases should be able to opt functions into a declaration saying they should be optimized as non-dispatching, and thus never perform multiple dispatch. Maybe just a macro on the function declaration. But if that’s the case, then I would like to see an error thrown if you try to define a second method.
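To make the “error on a second method” idea concrete, here is a minimal sketch. `assert_unimethod` and `parse_flag` are hypothetical names made up for illustration, not anything from Base or from the linked issue. A real unimethod declaration would need compiler support to reject the second definition at the point where it is written; this only checks after the fact.

```julia
# Hypothetical helper (not part of Base): verify that a function still has
# exactly one method, and throw otherwise.
function assert_unimethod(f)
    n = length(methods(f))
    n == 1 || error("$(nameof(f)) is declared unimethod but has $n methods")
    return f
end

# A function we intend to keep as a single method.
parse_flag(x) = x == "yes"
assert_unimethod(parse_flag)   # passes: exactly one method exists

# Someone adds a second method later on.
parse_flag(x::Int) = x != 0

# The check now fails, which is the behavior asked for above.
second_method_rejected = try
    assert_unimethod(parse_flag)
    false
catch err
    err isa ErrorException
end
```

The check has to be re-run after other code is loaded, which is why it is only an approximation of an actual declaration.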

I think where this goes wrong is that some use cases need fast single dispatch functions on type unstable code, so the compiler optimizes all single dispatch functions on type unstable code. That compiler optimization is then relatively easily mis-applied, though. I think a nice example of that is (::Any == ::Any)::Bool, which is perfectly fine in Base because Bool is the only return type of == in Base. But then any symbolic representation of code turns == into something symbolic, and all code with a type-unstable == recompiles just because Symbolics exists. The problem is, if == “should” only ever return a Boolean (which I don’t think should be the case, there are other counterexamples), then we should get an error for violating the rule. This compiler optimization creates “unwritten rules” which, if broken, give you 2 minute compile and load times. So technically you can define the dispatch, but in practice you know that in any widely used code you cannot. An example of this is{T}, x) for any concrete T (IIRC) is a major invalidator, so in theory you can make this dispatch but in practice you cannot because of the compiler heuristics and invalidation.
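A small sketch of the mechanism, using a stand-in function so that nothing in Base is modified. `myeq`, `caller`, `Sym` and `SymEq` are all hypothetical names for illustration. `Base.return_types` is an unexported reflection helper, so treat its output as a diagnostic rather than a stable API.

```julia
# Stand-in for Base's == so the sketch does not touch Base itself.
myeq(a, b) = a === b

# A caller that only knows its arguments as Any, like type-unstable code.
caller(a, b) = myeq(a, b)

# With a single method in existence, inference can assume the result type.
before = only(Base.return_types(caller, (Any, Any)))

# Hypothetical symbolic types, loosely in the spirit of a symbolic package.
struct Sym
    name::Symbol
end
struct SymEq
    lhs::Sym
    rhs::Sym
end

# A new method whose return type is not Bool.
myeq(a::Sym, b::Sym) = SymEq(a, b)

# The same query now has to account for the new method.
after = only(Base.return_types(caller, (Any, Any)))

println("inferred before: ", before)
println("inferred after:  ", after)
```

Any code compiled against the first answer has to be thrown away once the second method exists. That is the invalidation, scaled down to one caller.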

Those cases are not single dispatch cases, but what I mean by all of that is that if we need some cases to apply more compiler optimizations on type unstable code, then I think we need to give users the ability to specify those functions or that module as “optimize this more”. Currently we do this the other way around and default to performing all of these optimizations given what is seen in Base, but I don’t think assumptions built on the Base library are a good idea. “There is only one dispatch of this function in Base so therefore there will only be one dispatch of this function” or “All dispatches of this function in Base have the same return type so therefore assume all dispatches of this function will have the same return type” is not something I think you can extrapolate well from Base. Base is just too small of a sample of what Julia code is like to understand these behaviors. So I would much prefer we optimize like that less by default but let people opt certain functions or modules into such assumptions.

An aside: I think it would be interesting if, during the system image build, we could for example load up all of JuMP, SciML, etc. and then do the optimizations based on what we know about the dispatches that exist in the wild. I don’t think this is practical, but I think this is the kind of thing you’d actually have to do in order to know whether a compiler optimization of this sort would cause invalidations.

Also an aside, I’ve thought about adding “fake” dispatches to Base as a way to de-optimize a function. For example with the == example, we could define 4 singleton types and then do (::T == ::T)::T on those 4, which then gives Base enough methods that it won’t apply the optimization and then we won’t get invalidations downstream. We could in theory do that on all of the major invalidators. I don’t know if people would think that’s too much of a hack though.
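A rough sketch of the “fake dispatch” trick on a stand-in function rather than on Base’s ==. `padded_eq`, `padded_caller` and `Fake1` through `Fake4` are hypothetical names. It relies on my understanding that inference stops enumerating candidates once more than a small number of methods match (I believe the default limit is 3), which is a compiler heuristic and could change between Julia versions.

```julia
# Stand-in for ==, with a single Bool-returning method to begin with.
padded_eq(a, b) = a === b
padded_caller(a, b) = padded_eq(a, b)

# One method: inference is free to assume the result type.
narrow = only(Base.return_types(padded_caller, (Any, Any)))

# Four throwaway singleton types, used only to add extra methods.
struct Fake1 end
struct Fake2 end
struct Fake3 end
struct Fake4 end

# The (::T == ::T)::T pattern from the text, once per singleton type.
for T in (Fake1, Fake2, Fake3, Fake4)
    @eval padded_eq(x::$T, ::$T) = x
end

# With five candidate methods, inference should stop assuming Bool.
widened = only(Base.return_types(padded_caller, (Any, Any)))

println("one method:   ", narrow)
println("five methods: ", widened)
```

Once the result is no longer assumed to be Bool, a downstream package adding yet another method has no narrow assumption left to break.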
