Binary Caching and Pre-compilation .. again

I want to present here again, and perhaps more clearly, the concept of context dispatch, or more precisely, resolving multiple dispatch based on some context.
I believe this can help with precompilation and load-time issues, and may further the understanding of how to create a reusable binary cache for compiled code, and where and how to store such data.

The building blocks:

  • A Context is a set of modules.

  • The Context of module M is the set of all modules that module M depends upon.

  • Extension: for the purpose of multiple dispatch, a module M can declare that it extends function fun from module B. In “Julian” terms, module M specifically imports function fun from module B. Any call to M.fun will redirect to B.fun, and B.fun itself will dispatch to a specific method, considering also the definitions of B.fun in its method table.

  • Context Call: calling a function fun with a given context. The method table for fun is determined by the modules that are present in the context (and that extend fun). The context is propagated recursively down the AST, so every internal call is also a context call.
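
As a rough illustration of these building blocks, here is a toy model in plain Julia. It is only a sketch under stated assumptions: modules are represented by Symbols, the `deps` and `extensions` tables and the names `context` and `context_call` are made up for this example, and I assume the context of M includes M itself. It does not touch Julia's real method tables, and it leaves out the recursive propagation down the AST.

```julia
# Toy model only: hypothetical dependency graph, modules named by Symbols.
const deps = Dict(
    :Base => Symbol[],
    :A    => [:Base],
    :B    => [:Base],
    :M    => [:A],
    :Main => [:M, :B],
)

# Context of m: m itself (an assumption) plus everything it depends on, transitively.
function context(m::Symbol, seen = Set{Symbol}())
    m in seen && return seen
    push!(seen, m)
    for d in deps[m]
        context(d, seen)
    end
    return seen
end

# Hypothetical "extensions": which module contributes which single-argument method.
const extensions = [
    (mod = :Base, fun = :describe, argtype = Any,    impl = x -> "generic"),
    (mod = :A,    fun = :describe, argtype = Int,    impl = x -> "Int method from A"),
    (mod = :B,    fun = :describe, argtype = String, impl = x -> "String method from B"),
]

# Context call: only methods contributed by modules inside ctx are visible,
# and the most specific applicable one is chosen.
function context_call(ctx, fun::Symbol, x)
    visible = [e for e in extensions if e.fun == fun && e.mod in ctx && x isa e.argtype]
    isempty(visible) && error("no method of $fun visible in this context")
    best = first(visible)
    for e in visible
        e.argtype <: best.argtype && (best = e)
    end
    return best.impl(x)
end

from_M    = context_call(context(:M), :describe, "s")     # B is not in M's context
from_Main = context_call(context(:Main), :describe, "s")  # B is visible from Main
```

The same call resolves differently depending on the context it is made from, which is the whole point of the proposal.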

There was previously some debate about whether a context call is feasible … I made some heroic attempts to prove so,
but @jrevels's attempts were even more heroic, coming up with Cassette.jl, which made my previous attempts redundant. Kudos.

Binary Caching

Given a certain fixed context, it is quite clear that we can cache all compilation outputs for functions called within that context as long as we don’t define additional functions in any of the dependent modules.

It is also quite clear that the context of module Main is not cacheable, since it is dynamic and changing from invocation to invocation.

The current state of affairs in Julia is that every call is equivalent to a context call with the context of Main, which makes binary caching very hard, if not impossible.
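
To make the caching argument concrete, here is a minimal sketch of a cache keyed by a fixed context plus the call signature. Everything in it is hypothetical: `fake_compile` merely stands in for real compilation, and the context is again just a set of Symbols. The only point is that a fixed context gives a stable cache key, while any change to the context produces a different key.

```julia
# Toy cache: key = (frozen context, function name, argument types).
const cache = Dict{Any,String}()
const compile_count = Ref(0)

# Stand-in for real compilation; returns a label and counts how often it runs.
function fake_compile(fun::Symbol, argtypes::Tuple)
    compile_count[] += 1
    return "code for $fun with $argtypes"
end

function cached_compile(ctx, fun::Symbol, argtypes::Tuple)
    # Sorting makes the key independent of the iteration order of the set.
    key = (Tuple(sort!(collect(ctx))), fun, argtypes)
    return get!(() -> fake_compile(fun, argtypes), cache, key)
end

ctx_M = Set([:A, :Base, :M])
cached_compile(ctx_M, :f, (Int,))                 # compiles
cached_compile(ctx_M, :f, (Int,))                 # cache hit, same context
cached_compile(union(ctx_M, [:B]), :f, (Int,))    # different context, compiles again
```

With Main as the context, the key would change whenever anything new is loaded or defined, so in this model the cache would rarely be hit.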

Possible solutions
All solutions should incorporate some heuristics for “narrowing” the context, since any context that is not Main is probably cacheable.

  • Narrow context based on types: calling function fun from module M with argument types t1, t2, …
    will narrow the context to the context of M if all types are instantiable from the context of M.

  • Post-compilation narrowing: log the hierarchy of modules “visited” while compiling a function, and log the compiled code in the smallest context that includes all visited modules. Then, at a later stage, if we encounter a function call from context M and there exists a binary cache for that function in M or in any of the modules inside the Context of M … use the binary cache instead of compiling.

  • other?
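
The type-based rule could be sketched roughly as follows. This is one possible reading, not a worked-out design: I interpret "instantiable from the context of M" as "the type is defined in a module that belongs to the context", and use `parentmodule` to find the defining module. The `Geometry` module, `context_of_M`, and `can_narrow` are made-up names for the example.

```julia
# A module that is assumed NOT to be part of M's context.
module Geometry
struct Point
    x::Float64
    y::Float64
end
end

# Hypothetical context of some module M: here just Core and Base.
const context_of_M = Set{Module}([Core, Base])

# Narrowing is allowed only if every argument type is defined inside the context.
can_narrow(ctx, argtypes) = all(T -> parentmodule(T) in ctx, argtypes)

narrow_builtin = can_narrow(context_of_M, (Int, Float64))         # both defined in Core
narrow_point   = can_narrow(context_of_M, (Int, Geometry.Point))  # Point lives outside the context
```

Parametric types would need more care, since their parameters may come from modules outside the context even when the outer type does not.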

Further discussion

I feel this is a complicated subject better addressed in small pieces.

Using Cassette.jl, I can easily set up a POC for any kind of context dispatch rules. I will do so if there is interest from the devs and the community.


This is an important topic which needs to be discussed further.

The primary example I like to think of is the Reduce.jl package. Specifically, I encounter issues related to pre-compilation in conjunction with context. The package has an environment flag, ENV["REDPRE"], which can be used to toggle extra pre-compilation scripts.

These extra pre-compilation scripts are an experimental feature, intended to run some examples and tests of the code to ensure that some of the most used methods are called and compiled at least once.

Enabling the extra precompilation leads to a minor improvement in overall performance of the package when it is initially loaded.

However, one negative side-effect is INSTABILITY of the entire Julia program, which can lead to a segfault when Reduce is used in conjunction with another precompiled program (solved below).

julia> using Reduce,StaticArrays
[ Info: Recompiling stale cache file /home/flow/.julia/compiled/v1.0/Reduce/wEGBP.ji for Reduce [93e0c654-6965-5f22-aba9-9c1ae6b3c259]
[ Info: Precompiling extra Reduce methods (set `ENV["REDPRE"]="0"` to disable)
Reduce (Free CSL version, revision 4590), 11-May-18 ...

julia> Reduce.Algebra.:+(:(x+y),:z)

signal (11): Segmentation fault
in expression starting at no file:0
unknown function (ip: 0x7fad3a307d26)
unknown function (ip: 0x7fad3a30c92f)
...

The Solution to the instability, I discovered last night, is to simply disable the additional pre-compilation scripts.

After disabling this extra precompilation, all the mysterious issues disappeared. Now I am also able to use StaticArrays in conjunction with Reduce to do symbolic computations on static dual Grassmann numbers with the Grassmann.jl package:


julia> using Reduce, Grassmann
Reduce (Free CSL version, revision 4590), 11-May-18 ...

julia> basis"2"
(++, e, e₁, e₂, e₁₂)

julia> (:a*e1 + :b*e2) ∧ (:c*e1 + :d*e2)
0.0 + (a * d - b * c)e₁₂

julia> (:a*e1 + :b*e2) * (:c*e1 + :d*e2)
a * c + b * d + (a * d - b * c)e₁₂

The problem was that I had additional pre-compilation enabled by default for Linux users (an experimental feature with caching) and this caused a segfault. Disabling it by setting the ENV variable fixes that.

As @TsurHerman has pointed out, context-specific pre-compilation and caching continues to be a main issue with Julia going forward. Many of the other important issues seem to have been worked out in Julia, but this context issue is something which continues to affect some users and developers.

Thus, I am also looking forward to further discussion on this topic.

So, maybe I’m missing something, but I’m not understanding how this:

is related to this:

It sounds like you’re grossly conflating two different issues; namely, 1: Your precompile scripts lead to a segfault, and 2: Precompilation is slow, annoying, and needs fixing somehow.

If you want to solve #1, I would recommend that you first break apart what this script is actually breaking/segfaulting on; to be honest, that is a beast that I don’t think anyone will be able to help you fix, and so maybe you need to reconsider this approach. Just my 2 cents.

If you want to solve #2, then someone needs to actually do this:

There have been quite a few threads started both here and on Julia’s GitHub issue tracker relating to this idea of “context dispatch solving binary caching/precompilation issues”, and frankly, I haven’t seen anything tangible actually come of it. There have also been rebuttals from a few of the core devs who do not believe that this approach has merit.

Of course, I am obviously not going to dissuade you from pursuing this approach if you so desire, and that’s fine. However, I think if we’re going to have a productive discussion on the benefits and downsides of this approach, the following items will need to be presented:

  • A MWE of @TsurHerman's ideas as actual working code
  • Tests for this code that ensure that pre-existing Base/Stdlib/package code doesn’t break under this new scheme (or is easily fixable, with plausible suggestions for how to do this)
  • Reproducible benchmarks of the time/memory/etc. savings of this new approach
  • Optional but helpful: A plan for how to incrementally move from Julia’s current state of being to this new approach, if we choose to do so

Now, I’m obviously just a naysayer that wants to put down people’s revolutionary and clever ideas. However, I think you’ll find that if you present me with clean, working code and benchmarks that illustrate your points in practice, I’ll very quickly come around and be willing to help out wherever I can :slightly_smiling_face:

And for anyone who would like some “context” on this proposal and related discussions, here was everything I could find:


It’s actually a third option, 3: you are uninformed about the details of my problem and what the causes of it are and how it relates.

Turns out I already did this exact testing myself, and already came to the conclusion that any amount of extra pre-compilation causes the issue. I don’t need anyone to do this work for me because I already investigated it.

The segfault is precisely related to the issues outlined in this thread. Specifically, there are extra pre-compilations happening with the script enabled which only occur in a specific context, especially with Base method extensions for example (which is a main component of the software providing extensions to many methods). Because the precompile cache is for a specific context, it segfaults when used with another package which extends methods also. It might be that they do not actually conflict with each other, but due to the precompiled binary caching (of the initially different extension context), it causes a segfault and a compatibility issue. Specifically, a function was compiled in a certain context ahead of time, and then called in a different context with different Base method extensions.
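
The general mechanism described here (not the segfault itself, which this does not reproduce) can be seen in a few lines of plain Julia. The names `describe` and `caller` are made up for the illustration. Code for `caller(1)` is first compiled while only the generic method exists; once another definition extends `describe`, that compiled code no longer matches the dispatch result and has to be discarded and recompiled.

```julia
# Only a generic method exists at first.
describe(x) = "generic"
caller(x) = describe(x)

before = caller(1)   # compiled and run with only the generic method visible

# Later, some other code (think: another package) extends the same function.
describe(x::Int) = "Int-specific"

after = caller(1)    # the same call now has to dispatch to the new method
```

In a running session Julia handles this by invalidating the stale compiled code. A binary cache that is reused across sessions needs an equivalent check against the set of extensions actually loaded.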

The solution is to simply not run the extra precompile script, which just means that the first time you run the program it takes a long time to precompile. However, if Julia had better context-specific binary caching of methods which get extended in multiple contexts, I believe this would help improve my situation … and this is what this topic was originally about: being able to cache method extensions handled in a context-specific way.

I apologize, apparently I am indeed missing some context here. I did not mean to offend.

Would you mind providing details on the specific issues you encountered while troubleshooting this issue? This could well be an issue in Julia that needs addressing (other than the obvious issue of precompile latency), and/or the specific details of your investigation might shed light on what specifically can be done to fix the proposed problem(s) with contexts.


It seems the longer binary caching is delayed, the harder it becomes.
Does Cassette.jl allow injecting binary results into the compiler, if someone wanted to experiment with per-environment/per-session caching?


This thread was probably never updated because it’s so old, but there is active development towards native code caching in JuliaLang/julia pull request #44527, "Support external linkage in "sysimages"", by timholy.
