Is it possible to preserve compilation cache from julia process to julia process?

Hi there,

My aim is to speed up my Julia development environment. I am already using __precompile__(false) for the package I am developing, and Revise as much as possible, but whenever the Julia process restarts, the package takes a really long time to load initially. Like too long.

Is it possible to preserve the cache from one Julia process and use it in the next one? I mean, all these JIT compilations are already done; it would be so cool if they could simply be reused.

Is there a command line argument for this or some other way?


I think one low-hanging fruit is julia -O0:

╭─fxw at earth in ⌁
╰─λ julia -O0
At startup Revise.jl and OhMyREPL.jl loaded
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time @eval (exp(randn((10,10))));
  1.378665 seconds (4.77 M allocations: 315.244 MiB, 5.52% gc time, 99.97% compilation time)

julia> 
╭─fxw at earth in ⌁
╰─λ julia -O2
At startup Revise.jl and OhMyREPL.jl loaded
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time @eval (exp(randn((10,10))));
  3.788491 seconds (4.77 M allocations: 315.242 MiB, 2.10% gc time, 99.98% compilation time)

julia> 
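
If lowering the optimization level for the whole session is too blunt, the same tradeoff can probably be scoped to a single module. A minimal sketch, assuming a hypothetical module named FastLoadDemo; note that Base.Experimental.@optlevel is an experimental API and may change between Julia versions:

```julia
# Hypothetical module, for illustration only.
module FastLoadDemo

# Request optimization level 0 for code defined in this module,
# similar in spirit to `julia -O0` but limited to one module.
# Experimental API: behavior may differ between Julia versions.
Base.Experimental.@optlevel 0

addone(x) = x + 1

end # module

# Optimization level the current session was started with (the -O flag).
session_optlevel = Base.JLOptions().opt_level

println("session -O level: ", session_optlevel)
println(FastLoadDemo.addone(1))
```

The idea is that the package under development compiles quickly while the rest of the session keeps its default optimization level.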


Check out PrecompileTools.jl.

Thank you - this is awesome! I am happy to trade optimal performance for a quicker startup time.


PrecompileTools requires __precompile__(true) in order to work, and it makes precompilation even slower (because the example workload has to run so that the precompile statements can be tracked). Not ideal for development.


Fundamentally the challenge here is cache-invalidation and knowing that a cache file can be reused.

In Julia we track the validity across sessions and the package boundary. So when we load a package image we check that all the dependencies are the same as before and that all the source files it depends on are the same.
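
To make the idea of content-based validity concrete, here is a rough sketch. It is not how Julia's loader is implemented; cache_key and cache_is_valid are hypothetical helpers, and the dependency list is just a made-up Dict of names to version strings:

```julia
# Illustrative only: NOT Julia's actual package-image validation.
# Build a key from the dependency versions plus the contents of every
# source file; a cached artifact is reusable only if the key still matches.
function cache_key(source_files::Vector{String}, deps::Dict{String,String})
    h = hash(sort(collect(deps)))          # dependency names and versions
    for path in sort(source_files)
        h = hash((path, read(path)), h)    # fold in each file's bytes
    end
    return h
end

# The cache is considered valid only when nothing it depends on has changed.
cache_is_valid(stored_key, source_files, deps) =
    stored_key == cache_key(source_files, deps)
```

Any edit to a source file or any change in a dependency version produces a different key, so the whole cached image is discarded, which is why this kind of check works at the package boundary rather than per method.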

Within a session we also track validity, but at the method level. Instead of being content-based, this tracking follows the global state of the method table (i.e. the world-age mechanism).
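
A small sketch of what "global state of the method table" means in practice: the world counter is a session-wide number that advances when methods are defined. The function name greet is made up for the example:

```julia
# World age before defining anything new.
w_before = Base.get_world_counter()

# Adding a method changes the method table and advances the world age.
greet() = "hello"

w_after = Base.get_world_counter()

println("world age advanced: ", w_after > w_before)
```

Because this counter only has meaning inside the running process, compiled code that is valid "as of world N" cannot simply be carried over into a fresh session.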

So the answer currently is no. The only way to save state across sessions is to use the system/package image mechanism (which is what __precompile__(false) opts out of).

I think most people are using the normal precompilation mechanism with Revise to take advantage of the cache…

Could we eventually have a more fine-grained cache? Perhaps, but it is not an easy problem.


Thank you for the details on the existing caching mechanism.

It sounds to me like it might be easier to hope that Revise will eventually also be able to revise struct definitions, constants, etc., so that everything is revisable.

Then you really just need to keep a Julia session open all the time (but even then you may want to shut down your PC without a huge penalty…)

How about revising struct definitions Pluto-style by taking advantage of the contextual REPL?

julia> module Main0001 end
Main.Main0001

(Main.Main0001) julia> struct Foo end

(Main.Main0001) julia> foo(::Foo) = println("Hello World")
foo (generic function with 1 method)

(Main.Main0001) julia> foo(Foo())
Hello World

julia> module Main0002 end
Main.Main0002

(Main.Main0002) julia> struct Foo
                           greeting::String
                       end

(Main.Main0002) julia> foo(f::Foo) = println(f.greeting)
foo (generic function with 1 method)

(Main.Main0002) julia> foo(Foo("Guten morgen"))
Guten morgen
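
The same pattern can be written as a plain script. This is only a sketch of the idea; in the REPL the switch between modules is presumably done with REPL.activate (available in recent Julia versions), which the transcript above does not show:

```julia
# Each "revision" of the struct lives in its own module, so the two
# definitions of Foo never conflict with each other.
module Main0001
struct Foo end
foo(::Foo) = println("Hello World")
end

module Main0002
struct Foo
    greeting::String
end
foo(f::Foo) = println(f.greeting)
end

Main0001.foo(Main0001.Foo())
Main0002.foo(Main0002.Foo("Guten morgen"))

# The two Foo types are unrelated, so instances of the old one are
# not converted or updated in any way.
println(Main0001.Foo === Main0002.Foo)
```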

Could someone make a julia process load a provided cache? Sure, that’s just a sysimage. Could there be a useful way to interactively save a julia process for that cache? Probably not.

First consider the environment. Say you install a package, compile some of its methods (some of which may be inlined in your own methods), then cache the process for next time. Then you decide to update the package and its methods change. Do you throw the cache out completely or just the affected parts? Could the cache store information that allows you to do the latter without compromising performance? Could you update the cache automatically along with the environment? Would such updates be accessible in the current process or would you have to start another one?

Another problem is scoping. For a small example, a mathematical constant is available as the imported Base variable pi, but you could choose to shadow it with pi=80 in a global scope such as Main and work from there. You save that cache and send it to a colleague, who starts wondering why all their subsequent trigonometric methods give the wrong numbers. The simple, sane solution is to isolate that cache in its own global scope, so that it does not affect Main or any module the user makes, and to document all the source code that resulted in that cache.
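
A minimal sketch of that isolation, using a hypothetical module named ShadowDemo in place of a saved session. The shadowed pi stays inside the module and Main keeps seeing the Base constant:

```julia
# Hypothetical module standing in for "the cached scope".
module ShadowDemo
pi = 80                        # new global that shadows Base.pi in this module only
circle_area(r) = pi * r^2      # resolves to ShadowDemo.pi, not Base.pi
end

println(ShadowDemo.circle_area(1))   # computed with the shadowed value
println(sin(pi))                     # Main still uses Base.pi
```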

This is exactly what packages and precompilation do, and loosely speaking you see a similar pattern in AOT-compiled languages for reusing compiled code. Interactively saving sessions isn’t a trivial feature, and the languages that have it made many rational tradeoffs that we likely won’t want and can’t make in v1.

I recall a PR or issue somewhere discussing an invalidation-like mechanism for constants, so you don’t have to resort to reassigning non-constant globals and settling for suboptimal code. However, structs and other instantiable types have an extra layer of difficulties, such as existing instances of the obsolete type. Automatic one-to-one replacement with instances of the new type isn’t always feasible, and even when it is, it can cause latency and nasty side effects that method redefinitions don’t. This really is a novel problem; the typical way to implement dynamically replaceable types is non-constancy and suboptimal code. We can already do that, we just don’t.


My takeaway from one such thread was that the way to get modifiable and revisable compile-time constants is to use functions instead. Replace

const foo = 1.0

with

foo() = 1.0

If you want you can change back when development has stabilized, but I don’t know if there’s really a good reason to.
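
A short sketch of why this works: redefining the function invalidates the compiled code of its callers, so they pick up the new value on the next call. The caller name scaled is made up for the example:

```julia
foo() = 1.0                 # stands in for `const foo = 1.0`
scaled(x) = x * foo()       # hypothetical caller; foo() can be inlined as a constant

first_result = scaled(2.0)

foo() = 0.5                 # redefine during development; callers are recompiled
second_result = scaled(2.0)

println((first_result, second_result))
```

Revise performs the same kind of method redefinition when you edit the source file, which is why the function form stays revisable while the const form does not.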
