What is "incremental compilation" and what does it mean for it to be "broken"?

I’ve seen posts talking about a warning “incremental compilation may be fatally broken for this module”. Sometimes it’s not clear why, sometimes it happens because they overwrote a method in a package. What does this mean exactly? I tried reading about incremental compilers that only recompile the edited parts of source code, but the details are pretty complicated and different across compilers, so I figured I would ask about Julia specifically. The broad concept reminds me of method invalidation and Revise.jl, but I don’t see how that could possibly break.

Consider three packages:

module A
f() = 1
end
module B
using A
A.f() = 2
end
module C
using A
A.f() = 3
end

Now the behavior of your system depends on the order in which you load these three packages; the last to define A.f() “wins.” This is broken behavior because Julia tries hard to ensure that the order of loading packages and compilation does not matter to the final outcome (this is why we have invalidation).
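
To make the order dependence concrete, here is a minimal single-session sketch. The names OrderA, load_b, and load_c are hypothetical: each load_* function only mimics what loading package B or C would do, by overwriting the zero-argument method inside OrderA. It is not how package loading is actually implemented.

```julia
module OrderA
f() = 1
end

# Hypothetical stand-ins for "loading B" and "loading C":
# each one overwrites OrderA.f() with its own definition.
load_b() = @eval OrderA f() = 2
load_c() = @eval OrderA f() = 3

# Load order B, then C: the definition from C is the last one standing.
load_b()
load_c()
result_b_then_c = OrderA.f()

# Load order C, then B: now the definition from B is the last one standing.
load_c()
load_b()
result_c_then_b = OrderA.f()

println((result_b_then_c, result_c_then_b))  # prints (3, 2)
```

The same call gives a different answer depending purely on which overwrite ran last, which is the situation the warning is about.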

Even more simply, it’s really bad form to change the behavior of a method. “Here’s how package A works, unless you happen to load B” is not scalable for a large ecosystem.

To address the meaning of “incremental compilation”: it’s the compilation that can be done when only a subset of the entire codebase is available. When we precompile a package, we don’t know what other packages it will be loaded with, and again this is why we need invalidation. This is very different from how C (the programming language, not the module above), for example, compiles, where first you define the entire “universe” of code and then it compiles without ever having to worry about what else might happen. But even in C there is a similar issue: if you’re building a shared library, the linker will complain if you load two libraries that define the same symbol. It’s basically the same warning as what you’re seeing here.

Would it be good to add a short version of your explanation to that warning, to reduce confusion and give people something actionable to do when they encounter this? As is, the warning isn’t really helpful if you’re not already deep in the weeds on how julia compilation works under the hood.

Maybe a precompilation section or page in the Pkg docs or something? The warnings already write a couple sentences, and I don’t think that little text could ever really get the point across.

Some points for clarification:

  1. Is it correct to say incremental compilation means precompilation of each package in isolation?

  2. Even if it’s broken, it would never stop our program from running, right? For example, loading (and invalidating) A,B,C in any order results in a functional program, even if the behavior differs by order.

  3. What does happen to precompilation when this is broken? Does loading ignore or delete cached precompiled code?

  4. Am I correct to think this is limited to type piracy altering the dispatch behavior? IIRC from a previous thread, extending a function without type piracy should not cause method invalidations, so shouldn’t package X precompile fine regardless of whether package Y has added X.foo(::Y.WhyType) yet?

  5. So far this has been about separate packages competing to determine a method’s behavior, but this warning has shown up for a constructor duplicated within a few lines of StringDistances.jl. Why would that be a problem, shouldn’t the method dispatch have been sorted out before the package precompiles? There’s also an odder example of the overwriting happening in the same line in Flux.jl, but that went away so I assume it was a patched Julia bug.

Note that it’s possible to enable warnings for overwritten methods with the --warn-overwrite=yes command line flag.
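
As a rough sketch of what that flag does, the snippet below starts a child Julia process with --warn-overwrite=yes and captures what it writes to standard error. It assumes the running Julia can spawn a copy of itself via Base.julia_cmd(); the function name f is only for illustration.

```julia
# Launch a fresh Julia process with overwrite warnings turned on.
# The child defines f() twice with the same signature.
cmd = `$(Base.julia_cmd()) --startup-file=no --warn-overwrite=yes -e "f() = 1; f() = 2"`

# Capture the child's stderr, where the overwrite warning is printed.
err = IOBuffer()
run(pipeline(cmd; stderr=err))
warning_text = String(take!(err))

print(warning_text)
```

Without the flag, the same overwrite in an ordinary session is silent by default.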

I think a bunch of messages like this could do with ending up in this form (rust-inspired):

Warning: something is dodgy
  for more information see `explaincode(:J123)`

Or if isinteractive() is false

Warning: something is dodgy
  for more information see `julia --explain J123`

Is that bad form too?

Isn’t that the whole point of multiple dispatch to change or extend behavior of methods of different packages?

julia> module A
         f(x) = 1
       end
Main.A

julia> module B
         using Main.A
         A.f(x::Int) = 2
       end
Main.B

julia> using Main.B

julia> Main.A.f(1)
2

julia> Main.A.f(1.0)
1

Yes

Correct. But the order in which you load packages is not something that we want to matter (it can introduce deadlocks in package dependency graphs, where two simultaneously-loaded packages need A and B in opposite orders), hence the scary warning.

Invalidation, which means the cached code is ignored.

You can only overwrite a method if you’re committing type piracy (all the types must be defined in the first package to load, otherwise they can’t have the same signature). Hence this warning only happens if you’re committing type piracy.

Not necessarily; compilation might happen any time you execute code at top level, e.g.,

module MyPackage
foo() = 1
const v = vcat(foo(), 2)
foo() = 2
end

compiles foo and then overwrites it during precompilation. At the end of the process v is not consistent with what you’d expect given the (current) definition of foo.
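
A renamed copy of that module, run in an ordinary script or REPL session rather than during precompilation, shows the mismatch. StaleConstDemo is a hypothetical name chosen to avoid clashing with the example above; the body is otherwise the same.

```julia
module StaleConstDemo
foo() = 1
# Evaluated at top level, so foo() is compiled and called right here.
const v = vcat(foo(), 2)
# Same signature as above: this replaces the first definition.
foo() = 2
end

# v was computed with the old foo and is never recomputed.
println(StaleConstDemo.v)      # prints [1, 2]
println(StaleConstDemo.foo())  # prints 2
```

So v records the result of a definition that no longer exists by the time the module is finished.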

Your example is type-piracy: B owns neither A.f nor Int. That method needs to be in A, not B. If B defines new types, then it can extend A.f to operate on those types; there’s no way A could know about those types.

But, you can commit type piracy and invalidate a method without overwriting it or its callees. Or by “overwrite” did you mean invalidation rather than definition?

julia> foo(::Integer) = 0; # not overwritten
julia> bar(x) = foo(x); # not overwritten
julia> bar(1)
0
julia> foo(::Int) = 1; # define new method
julia> bar(1) # bar(::Int) was invalidated
1

If broken incremental compilation means invalidation of precompiled code, then I imagine foo(::Int) = 1 invalidating bar(1) would do that just like foo() = 2 invalidating vcat(foo(), 2)?

Didn’t seem like the method was executed between definitions in the linked example. Here’s the patch that fixed broken incremental compilation:

struct Normalized{T <: Union{StringSemiMetric, StringMetric}} <: StringSemiMetric
    dist::T
end
#- Normalized(dist::Union{StringSemiMetric, StringMetric}) = Normalized{typeof(dist)}(dist)
Normalized(dist::Normalized) = dist

The package was small enough to read over, and it doesn’t seem like there was a repeated include somewhere, either. Do type constructors compile the default constructor upon definition? (MethodAnalysis.methodinstances doesn’t seem to work on type constructors or functor instances). Whichever the case, does this count as type piracy if it’s all in 1 package?

I’ll note that your first sentence is consistent with what Tim Holy is saying, so I’m not sure what’s your point.

When tim.holy said “change the behavior of” or “overwrite” a method, I interpreted it as something like this:

foo(::Int) = 0
foo(x::Int) = x # foo(::Int) method gets replaced

as opposed to this

foo(::Integer) = 0
foo(x::Int) = x # foo(::Integer) method is unchanged

Yes, that’s how “overwrite” is used.
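
One way to check the difference between the two cases above is to count methods. This is just a sketch; replaced and extended are hypothetical names standing in for the two versions of foo.

```julia
# Case 1: identical signature, so the second definition replaces the first.
replaced(::Int) = 0
replaced(x::Int) = x
n_replaced = length(methods(replaced))   # 1: the old method is gone

# Case 2: a more specific signature, so a second method is added.
extended(::Integer) = 0
extended(x::Int) = x
n_extended = length(methods(extended))   # 2: both methods coexist

println((n_replaced, n_extended))  # prints (1, 2)
```

In the second case the ::Integer method is still there and still handles other integer types such as Bool; only the first case is an overwrite.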

And my point was that type piracy can occur without overwriting any existing method, at least given my understanding of what overwriting means, so I was asking for clarification. That’s almost verbatim to what I said before so I don’t understand the disconnect here.

But none of those make behavior dependent on the order of package loading: if B and C do not overwrite A.f or each other, then behavior is consistent no matter what order you load packages in. In other words, this is not a warning about invalidation or type-piracy: it’s a warning that you may have a broken system. The cache files contain method definitions, not just native code, and it’s the state of the method definitions that is most strongly at risk here.

Ah, I was under the impression that any invalidation causes the warning, but it’s only a particular subset. So to recap: incremental compilation is the package-wise precompilation, invalidations toss out precompiled code, and only the subset of invalidations caused by direct method replacement also breaks it, which means behavior varies by package loading order.