Is it possible to preserve compilation cache from julia process to julia process?

schlichtanders · April 25, 2024, 8:43am

Hi there,

My aim is to speed up my Julia development environment. I am already using __precompile__(false) for the package I am developing, and Revise as much as possible, but whenever the Julia process restarts, the initial package load takes far too long.

Is there a possibility to preserve the cache from one julia process to use it in a next one? I mean all these jit compilations are already done, it would be so cool if they could simply be reused.

Is there a command line argument for this or some other way?

roflmaostc · April 25, 2024, 4:23pm

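I think one low-hanging fruit is julia -O0, which lowers the optimization level and with it the compilation time. Compare it with the default -O2: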

$ julia -O0
At startup Revise.jl and OhMyREPL.jl loaded
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time @eval (exp(randn((10,10))));
  1.378665 seconds (4.77 M allocations: 315.244 MiB, 5.52% gc time, 99.97% compilation time)

julia> 
$ julia -O2
At startup Revise.jl and OhMyREPL.jl loaded
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time @eval (exp(randn((10,10))));
  3.788491 seconds (4.77 M allocations: 315.242 MiB, 2.10% gc time, 99.98% compilation time)

julia>
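
A related knob, sketched here as an assumption about what might suit a development workflow rather than something proposed in this thread: `Base.Experimental.@optlevel` lowers the optimization level for a single module instead of for the whole session. `DevPkg` and `work` are hypothetical names.

```julia
# Hypothetical module standing in for a package under development.
module DevPkg
    # Request optimization level 0 for code defined in this module only.
    Base.Experimental.@optlevel 0

    work(x) = sum(abs2, x)
end

# Optimization level the session was started with (2 by default).
session_opt_level = Base.JLOptions().opt_level

# The first call includes compilation; the second reuses the compiled code.
t_first  = @elapsed DevPkg.work([1.0, 2.0])
t_second = @elapsed DevPkg.work([1.0, 2.0])
println("session -O", session_opt_level,
        ": first call ", t_first, " s, second call ", t_second, " s")
```

Whether this helps noticeably depends on how much of the latency comes from the module itself rather than from its dependencies.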

danielwe · April 25, 2024, 4:28pm

Check out PrecompileTools.jl.

schlichtanders · April 26, 2024, 11:50am

Thank you, this is awesome! I am happy to trade optimal performance for a quicker startup time.

schlichtanders · April 26, 2024, 11:52am

PrecompileTools requires __precompile__(true) in order to work, and it makes precompilation even slower (because the example workload has to run so that the precompile statements can be tracked). Not ideal for development.

vchuravy · April 26, 2024, 12:20pm

Fundamentally the challenge here is cache-invalidation and knowing that a cache file can be reused.

In Julia we track the validity across sessions and the package boundary. So when we load a package image we check that all the dependencies are the same as before and that all the source files it depends on are the same.

Within a session we also track validity, but at the method level. Instead of being content-based, this tracking relies on the global state of the method table (i.e. the world-age mechanism).
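
As a small illustration of the world-age mechanism mentioned above (a sketch only; `demo` is a hypothetical function), the global world counter advances whenever a method is defined or redefined, which is what lets the runtime decide whether previously compiled code is still valid within a session:

```julia
# Hypothetical function used only to observe the world counter.
demo() = 1
world_before = Base.get_world_counter()

# Redefining the method changes the method table and advances the world age.
demo() = 2
world_after = Base.get_world_counter()

println("world age before: ", world_before, ", after: ", world_after)
```

The counter values are specific to one running process, which hints at why this kind of validity information is hard to carry over to a new session.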

So the answer currently is no. The only way to save state across sessions is to use the system/package image mechanism (which is what __precompile__(false) opts out of).

I think most people are using the normal precompilation mechanism with Revise to take advantage of the cache…
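
A minimal sketch of what opting back in could look like, assuming a hypothetical package module `MyDevPkg` with a function `work`: when such a module is a real package with precompilation enabled, code compiled by a top-level `precompile` call during precompilation is stored in the package's cache and reused by later sessions as long as the cache is still valid. Run as a plain script, as here, it only compiles within the current session.

```julia
# Hypothetical package module; in a real package this would live in src/MyDevPkg.jl.
module MyDevPkg
    work(x::Vector{Float64}) = sum(abs2, x)

    # Compile `work` for this argument type while the module is being loaded.
    # `precompile` returns true when compilation succeeded.
    precompiled_ok = precompile(work, (Vector{Float64},))
end

println("precompiled: ", MyDevPkg.precompiled_ok)
println(MyDevPkg.work([3.0, 4.0]))
```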

Could we eventually have a more fine-grained cache? Perhaps, but it is not an easy problem.

schlichtanders · April 26, 2024, 12:41pm

Thank you for the details on the existing caching mechanism.

It sounds to me like it might be easier to hope that Revise could eventually also revise struct definitions, constants, etc., so that everything is revisable.

Then you really just need a constantly open Julia session (but even then you may want to shut down your PC without a huge penalty…)

mkitti · April 26, 2024, 3:27pm

How about revising struct definitions Pluto-style by taking advantage of the contextual REPL?

julia> module Main0001 end
Main.Main0001

(Main.Main0001) julia> struct Foo end

(Main.Main0001) julia> foo(::Foo) = println("Hello World")
foo (generic function with 1 method)

(Main.Main0001) julia> foo(Foo())
Hello World

julia> module Main0002 end
Main.Main0002

(Main.Main0002) julia> struct Foo
                           greeting::String
                       end

(Main.Main0002) julia> foo(f::Foo) = println(f.greeting)
foo (generic function with 1 method)

(Main.Main0002) julia> foo(Foo("Guten morgen"))
Guten morgen
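
The same idea can be sketched outside the REPL, assuming `@eval` into fresh modules is an acceptable stand-in for switching the REPL's contextual module (which, as far as I know, recent Julia versions do via `REPL.activate`). Here `foo` returns the string instead of printing it, purely to make the result easy to check:

```julia
# First "generation" of the definitions lives in its own module.
module Main0001 end
@eval Main0001 begin
    struct Foo end
    foo(::Foo) = "Hello World"
end

# A changed struct definition goes into a new module instead of replacing the old one.
module Main0002 end
@eval Main0002 begin
    struct Foo
        greeting::String
    end
    foo(f::Foo) = f.greeting
end

println(Main0001.foo(Main0001.Foo()))
println(Main0002.foo(Main0002.Foo("Guten morgen")))
```

Both `Foo` types coexist, so instances created against the old definition keep working with the old methods.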

Benny · April 27, 2024, 12:24am

Could someone make a julia process load a provided cache? Sure, that’s just a sysimage. Could there be a useful way to interactively save a julia process for that cache? Probably not.

First consider the environment. Say you install a package, compile some of its methods (some of which may be inlined in your own methods), then cache the process for next time. Then you decide to update the package and its methods change. Do you throw the cache out completely or just the affected parts? Could the cache store information that allows you to do the latter without compromising performance? Could you update the cache automatically along with the environment? Would such updates be accessible in the current process or do you have to start another one?

Another problem is scoping. For a small example, a mathematical constant is available as the imported Base variable pi, but you could choose to shadow it with pi=80 in a global scope such as Main and work from there. You save that cache and send it to a colleague, who starts wondering why all their subsequent trigonometric methods give the wrong numbers. The simple, sane solution is to isolate that cache in its own global scope, so that it does not affect Main or any module the user makes, and to document all the source code that resulted in that cache.
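
A minimal sketch of that isolation, with `Sandbox` and `area` as hypothetical names: the shadowed `pi` lives in its own module, so code outside that module still sees Base's constant.

```julia
module Sandbox
    # Shadows Base's `pi` inside this module only.
    pi = 80
    area(r) = pi * r^2
end

println(Sandbox.area(1))   # uses the shadowed value
println(sin(pi))           # outside the module, `pi` is still Base's constant
```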

These are exactly what packages and precompilation do, and loosely speaking you see a similar pattern in AOT-compiled languages for reusing compiled code. Interactively saving sessions isn’t a trivial feature, and the languages that have it made many rational tradeoffs that we likely won’t want and can’t make in v1.

I recall a PR or issue somewhere discussing an invalidation-like mechanism for constants, so you don’t have to resort to reassigning non-constant globals and settle for suboptimal code. However, structs and other instantiable types have an extra layer of difficulties, such as existing instances of the obsolete type. Automatic one-to-one replacement with instances of the new type isn’t always feasible, and even when it is, it can cause latency and nasty side effects that method redefinitions don’t. This really is a novel problem; the typical way to implement dynamically replaceable types is non-constancy and suboptimal code. We can already do that, we just don’t.

danielwe · April 27, 2024, 1:47am

My takeaway from one such thread was that the way to get modifiable and revisable compile-time constants is to use functions instead. Replace

const foo = 1.0

with

foo() = 1.0

If you want you can change back when development has stabilized, but I don’t know if there’s really a good reason to.
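
A short sketch of why the function form is revisable (`double_foo` is a hypothetical caller): redefining `foo` invalidates compiled callers, so they pick up the new value on their next call.

```julia
foo() = 1.0
double_foo() = 2 * foo()   # the compiler may inline foo() here as a constant

before = double_foo()

# Redefining the method invalidates code that depended on the old definition.
foo() = 3.0
after = double_foo()

println("before: ", before, ", after: ", after)
```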
