"Meaning", type-piracy, and method merging

Yes, that is indeed type-piracy. You are extending a method that you haven’t locally defined to take arguments where none of the argument types are locally defined. All of a sudden [1,2,3] * @SVector [1,2,3] will no longer throw an error.

You can either create a new function, or wrap one of the input arguments.
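As a sketch of the wrapping approach (the wrapper name here is made up, not from any real package):

```julia
# Hypothetical wrapper type: WeightVec is defined locally, so extending
# Base.* with it is NOT piracy -- no call that worked before this code
# was loaded changes its meaning.
struct WeightVec
    data::Vector{Int}
end

# At least one argument type (WeightVec) is ours, so this method can
# only be reached by code that opted in by constructing a WeightVec.
Base.:*(a::Vector{Int}, w::WeightVec) = sum(a .* w.data)

[1, 2, 3] * WeightVec([4, 5, 6])  # 4 + 10 + 18 = 32
```

Plain `[1,2,3] * [4,5,6]` still errors as before; only calls that explicitly wrap an argument see the new behaviour.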

2 Likes

Great, thank you - I guess the wrapping is the key point here.

In my case I can just define a new function, but if I wanted to use somebody else’s module in this way, then I would need to wrap my input to make it clear which version of * I am using, without affecting anybody else. Makes perfect sense.

This is not true. Methods don’t belong to modules: there’s just one global method table, and all of them are in there. Also, the compiler keeps track of exactly what analysis it ran and only needs to recompile the specific state that got changed. (this is the basis for “world errors” & the fix for issue #265)

3 Likes

I don’t understand why you think this. When they are in a package, your favorite operators are just a using/import away.

3 Likes

That’s part of the problem… the global method table is un-cachable, so to speak, since every imported module
may change a Base function by adding a method to it, potentially changing its behaviour, and thus invalidating
everything that depends on that function.

example:

julia> module M
           import Base: +
           (+)(A::AbstractString, B::Number) = parse(Int, A) + B
           myfunc(A, B) = A + B
       end
M

julia> M.myfunc("1",2)
3

julia> module N
           import Base: +
           (+)(A::String, B::Number) = parse(Int, A) - B
       end
N

julia> M.myfunc("1",2)
-1

The lack of a binary cache is a hindrance for big projects… imagine you were working on the Linux kernel or OpenCV or any other huge project,
and you had to recompile everything, every time you exited the REPL or restarted a session.

So don’t do that? We could definitely use better tooling to identify when (and why) method invalidation is occurring, but a case like your modules M and N is so pathological that it’s hard to see how this is a big hindrance.

3 Likes

This seems like a textbook example for why type piracy is discouraged.

3 Likes

The example itself is not important; yes, it is pathological. But it shows that you cannot build a “reliable”, provable cache that acts the same as its non-cached counterpart.

@mbauman @jameson
I am working on a POC for handling dispatch differently,
in a way that guarantees that if a module and its dependencies don’t change, then its binary cache does not change.

Plus, as long as you don’t depend on type piracy as a feature, no changes are needed to your code.

Plus it might make runtime dispatch faster.
Maybe.

I have it working but it needs some polish to present it, so in a few days I’ll present all the details.

It doesn’t recompile everything, just whatever needs to change. Sure, that could be a big set of things, but the design doesn’t mandate recompiling everything. The granularity of most makefiles is similarly large; changing a comment in a C++ header could trigger arbitrary amounts of recompilation.

I think it would be nice to be able to “seal” a compiled module, in the sense that you can use it but not recompile anything in it even if a method it uses gets changed. That behavior, of course, is sort of the “default” in C, while the default in julia is to constantly re-run “make” for you.

This is what we already do (except that we don’t save native code to disk). If no dependencies change, of course we won’t recompile anything. Anyway this seems to me to be a matter of compiler features and implementing new execution models, not changing how dispatch works. Caching native code for packages would be great but it seems to me we could do it without changing the dispatch system.

8 Likes

That example is type-piracy, for sure, but only because he didn’t put his own type into the dispatch (i.e. extending Base.+ with a method dispatching on M.MyType is the proper way to implement a + in Julia, since you can’t put it in the M namespace).
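A minimal sketch of that non-piratical pattern (module and type names from the post; the method body is just an illustration):

```julia
module M
    struct MyType
        s::String
    end
    # Not piracy: the signature mentions M.MyType, which M owns, so no
    # dispatch that existed before M was loaded can change behaviour.
    Base.:+(a::MyType, b::Number) = parse(Int, a.s) + b
end

M.MyType("1") + 2  # 3
```

The `+` still lives in Base’s namespace, but because one argument type is owned by M, only code that deliberately constructs an `M.MyType` ever reaches the new method.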

Of course, the compiler has no way to know whether the extension of Base.+ was type piracy or something innocuous, so it would have to assume that any extension to Base could be changing the dispatch of existing code, and recompile.

If I understand it correctly, what @TsurHerman is saying is that caching is an issue when someone extends an existing module (in particular, Base)? This is happening constantly with the Julia dispatch approach of different methods extending methods in a shared namespace, especially all the stuff added into Base or meta-packages for sharing method names. Doesn’t this make it impractical to “close” the Base namespace or packages with shared names? I thought this was the big issue in caching binary code without constantly recompiling with every permutation of the imported libraries, but maybe I missed something.

One thought: I suppose that instead of caching the Base module itself, you could cache the state reached after a sequence of using statements for all the packages used, in a particular order, for a given project. Then, if you loaded the same imports/using sequence, compilation could start from that point.

If so, this is basically the pre-compiled headers approach used by some C++ compilers, which I would describe as a disaster. I think there are some good lessons there… the inclusion of header files in the C preprocessor (which is very similar to Julia’s include) ended up making it impossible to have smart precompilation at the module level. They have tried to fix this with Modules in the last decade, but I suspect it will also be a failure.

No way to know, except the type signature? We don’t specifically detect “type piracy”, but we maybe could if we wanted to, and we certainly use the type signature to decide what to recompile.

3 Likes

I have a feeling that a scheme like the following could get rid of a lot of compilation overhead, and would be ugly, but simple:

When we need to compile something, we try to bring it into “canonical form” as fast as possible. Then we compute some cryptographic hash of the “canonical form”, and store the compilation result in a big persistent on-disk LRU cache. Whenever we need to compile beyond canonical form, we look up the LRU cache.

The “canonical form”, i.e. the “thing we hash” must contain hashes of all dependencies, i.e. everything that would cause recompilation. That is, we want a big fat hash-tree (well, Hash-DAG). I don’t know how the internal mechanism for invalidating cached compiled methods works today, and hence cannot say for certain whether it is suitable as input for a hash. But in principle, such a cache should be able to piggyback on the existing mechanism, and should be consistent as long as (1) the existing system is consistent and (2) the hash is collision-resistant.
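A toy sketch of such a hash-tree cache (all names here are hypothetical; real caching of compiled code is far more involved):

```julia
using SHA  # stdlib, for sha256

# Content-addressed cache: an entry's key hashes its own "canonical form"
# together with the keys of its dependencies, so any upstream change
# propagates into every downstream key (a Merkle-style hash DAG).
const CACHE = Dict{String,String}()

entry_key(canonical::String, dep_keys::Vector{String}) =
    bytes2hex(sha256(join([canonical; sort(dep_keys)], '\0')))

function compile_cached!(canonical::String, dep_keys::Vector{String}, compile)
    key = entry_key(canonical, dep_keys)
    # Only run the expensive `compile` step on a cache miss.
    get!(() -> compile(canonical), CACHE, key)
end

k1 = entry_key("f(x) = x + 1", String[])
compile_cached!("g(x) = f(x) * 2", [k1], c -> "code for: " * c)
```

Because a dependency’s hash is folded into each dependent’s key, changing f invalidates g automatically: the lookup for g simply misses, which matches the “consistent as long as the hash is collision-resistant” property above.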

The LRU cache would be size-limited but not garbage collected. I think 1GB of binary code plus some overhead for data structures goes a very long way, and people could just run their test cases on a clear cache, and then freeze this cache as a “poor man’s precompilation” (depending on whether it would contain LLVM or native code).

If or when you do that, please provide an easy way to also account for return types, i.e. to recognize when a method’s returned type changes. That way Julia could know ahead of time, rather than only just in time, that dependent code needs recompiling.

That makes me think of the way the Manifest.toml info is used in the new package system to freeze a set of packages into a reproducible state.

I’d love to see that capability, it’s the sort of thing that people who want to assure more predictable behavior from a software system would want.

It would be especially nice if cached compilation results (whether just to the level that the current precompilation works, or all the way to native code) could be 1) shared among all julia processes (only one copy in memory) 2) be able to be GCed.
We had that for MUMPS / Caché ObjectScript, and it made a huge difference in the amount of memory needed to run thousands of processes on a machine.
Could Julia package up either the .ji info and/or generated native code into a standard shared library, and then dynamically load that on any process that wanted to use it?
I’ve been thinking about using that technique for the data for Unicode tables, as well as HTML, LaTeX & Emoji entity tables, instead of loading them into each Julia process.

It does not seem like it’s possible to have a productive conversation about this subject while ignoring repeated statements by people who ought to know, that method extension and multiple dispatch do not, in fact, cause “constant recompilation”. As far as I can recall, @TsurHerman and others here have simply ignored these statements and forged ahead, assuming, contrary to reality, that method extension is the main culprit in Julia’s compilation issues.

The amount of recompilation added by the fix for #265 is an upper bound on how much compilation one could save by changing or limiting method extension since before that we simply (incorrectly) didn’t recompile anything due to method extension. Even though Julia’s behavior was incorrect before that, it mostly worked—because the difference was actually not that huge. The upper bound isn’t very high: the fix of #265 did add some compilation time but not very much. Or put another way, Julia simply doesn’t do very much recompilation due to method extension.

If you think that some redesign of method extension or dispatch can allow everything to be compiled separately and in advance, let me pose a thought experiment:

  1. package A defines a higher-order qsort(cmp::Function, v::Vector), much like C’s qsort,
  2. package B defines a comparison function, my_cmp, and
  3. package C defines a type called FancyType.

If A, B and C are all precompiled modules, and some hypothetical redesign of Julia’s method extension and dispatch has guaranteed that no additional compilation is ever required, then which of these compiled modules provides the code that implements this function call:

A.qsort(B.my_cmp, [C.FancyType(k) for k = 1:100])

Where does that precompiled implementation of this A.qsort method live? Keep in mind that A does not know about B.my_cmp or about C.FancyType and likewise B and C do not know about A.

8 Likes

Yes, this is roughly what caching of generated code would need to look like. The bit about “hashing all dependencies” is the tricky part. There are also issues with reloading and using machine code in a new Julia process where globals may not be in the same location, which requires a Julia-level dynamic linker.

1 Like

Thanks for the explanation!

So, do I understand it right that the dependency DAG already exists? That is, each entry in the compiled-method cache needs, upon creation, to register itself in the cache entry of every dependency; and at the same time, it could grab the dependency’s (cached) hash. In other words, we need the reverse of the currently stored DAG (but we compute it anyway during creation).

Another question would be: When, during compilation, do we learn our dependencies? If this is very late (most cycles already spent), then this kind of cache becomes pretty useless.

I don’t think this is so tricky. I mean, we just want to build a persistent second-level cache for specialized methods. Each specialized method in the current (first-level) cache would need to store its own method_world_hash. Then, we just concatenate / Merkle-tree / whatever all the dependencies’ method_world_hashes, possibly in the same function call that registers us as a dependent. And then, we look into the second-level cache to see whether we can short-cut the rest of compilation.

During run-time, we would still rebuild the entire dependency-DAG and recompile everything as far as needed to figure out the dependencies. And once a Julia dynamic linker happens, if ever, everything becomes even faster; until then, we’d only skip a handful of llvm passes, unfortunately excluding machine-code generation.

I think inference cannot be easily cached this way, because we need it in order to resolve dependencies? I must confess that I don’t understand how current precompilation / sysimg deals with that.

Fortunately (for this purpose) the majority of time is spent optimizing LLVM IR and generating native code. So even if we had to completely repeat type inference and julia-level optimizations, there would still be significant savings.

2 Likes