What is hard about saving a Julia session from the REPL at any time?

I’m learning PackageCompiler.jl, and I’m wondering why creating a sysimage requires an input of packages, the project/environment, and precompile scripts. I do appreciate the paper trail, but couldn’t a sysimage be created from a session on the fly, like calling savesessionto(file) in the REPL?

This thread gave me the impression that such a session snapshot is possible and actually done for base Julia. However, I read issue #22598 from 2017 that “it seems to be technically nearly impossible to do this because of the large amount of program state which cannot be persisted or restored across processes.” Is that still true? If so, how does PackageCompiler manage to do it with the aforementioned stipulations? If not, why isn’t savesessionto standard issue?

I spent a small amount of time a while ago trying to make this work.

Here’s my WIP rebased: add exit_save_sysimage · IanButterworth/julia@454090a · GitHub

One of the issues I came up against, IIUC, was that the output process needs all tasks to be finished. This was crashing during serialization of active tasks.

It might be worth giving it a go to see what it hits.

For instance

% ./julia --startup-file=no -e 'Base.exit_save_sysimage("f.so")'
[ Info: Julia exiting. A sysimage will be generated at "/Users/ian/Documents/GitHub/julia/f.so"
ERROR: Task cannot be serialized
Stacktrace:
 [1] exit
   @ ./initdefs.jl:28 [inlined]
 [2] exit
   @ ./initdefs.jl:29 [inlined]
 [3] exit_save_sysimage(fpath::String)
   @ Base ./output.jl:14
 [4] top-level scope
   @ none:1

Here is a list of sources of global state: separate-compilation.md · GitHub

I was thinking about this again, and I think it’s helpful to be aware of all the extra work PackageCompiler has to do to produce a sysimage file: PackageCompiler.jl/sysimages_part_1.md at master · JuliaLang/PackageCompiler.jl · GitHub

It seems quite tricky to me.

A short answer:
PackageCompiler relies on Julia’s internal serializer to collect all the global state, so it has similar limitations: it can’t save an active Task or other complicated state. This is quite reasonable, since an active Task means some thread might be in the middle of execution, and there’s no way to serialize that.
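
A rough way to see a similar restriction from user code is to try the Serialization standard library on a task that is still running. This is only an analogy: the stdlib serializer is a different mechanism from the one used to write sysimages, and the channel and task below are made up for illustration.

```julia
using Serialization

# A task that has started but is blocked, standing in for background work in a session.
ch = Channel{Int}(0)
t = @async take!(ch)
while !istaskstarted(t)
    yield()  # give the scheduler a chance to start the task
end

# Try to serialize the live task and record whether the attempt throws.
serialize_failed = try
    serialize(IOBuffer(), t)
    false
catch
    true
end
println("live task refused: ", serialize_failed)

# Let the task finish so nothing is left blocked.
put!(ch, 1)
wait(t)
```

The point is only that a half-executed Task has no obvious on-disk representation, whichever serializer is asked to write it.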

Anyway, caching such things (by this I mean all the global state in the session) doesn’t make sense if your main motivation is to reduce compilation time. I don’t mean that caching the whole session is useless. Indeed, it’s quite useful for getting reproducible results and for debugging. And at least on Linux there are many tools that cache a whole session by dumping the process’s entire memory space and restoring it later. It’s just slow, and it doesn’t scale well. When you start a different session or add a new library, you need to rebuild everything and it becomes slow again. So this is not the right way to go, only a temporary solution.

In summary, it’s impossible and nonsensical to cache all the state of a Julia REPL session and reuse it in the next session.

I assume you intended to make a distinction between the merits of PackageCompiler and saving a Julia session, but it’s not clear what the distinction is, because this reads like “caching global state” generally doesn’t save time.
It does make sense to me at least that saving can’t happen with active Tasks, as mentioned earlier in the thread. I also wouldn’t expect saving the session to be possible in the middle of a method call, and even if it were, it would be strange to complete a method call right after loading the session.

It can save time, but it’s of limited use. The chance that you can reuse the session is lower than you might expect. Many people quit the REPL and re-enter it because they want to redefine a struct or they have installed new packages, which means the old global state is wrong and should not be reused. Caching everything also takes time. If you can’t reuse the cache enough times, the benefit you get from it cannot compensate for the time you spend making it. So no, caching everything is not the right way to go.

Of course, people who are not package developers can benefit from this type of caching, because they seldom change their package environment or redefine structs. But this is too restricted, on top of the technical difficulty of implementing it.

The user can’t know whether the session is in the middle of a method call, because Julia is multi-threaded and there might be Tasks running in the background, let alone locks, open sockets, libuv handles, or pending signals. What’s more, “saving the session in the middle of a method call” is exactly what happens if you expose this feature to the user. Users would call this function (let’s name it checkpoint()) in a REPL session when they think their work is done and they are ready to save it, yet the running REPL is itself in the middle of a method call (the function run_repl). Now you see how complicated Julia is. Julia’s runtime, Base library, and compiler maintain a lot of state that is hard to serialize, and serializing it would be meaningless.
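
To make that concrete, here is a small probe of what any such function would see when it is called. The name checkpoint_probe is hypothetical, like checkpoint() above; it only inspects the calling task and does not save anything.

```julia
# Hypothetical helper: reports the state a checkpoint function would find itself in.
function checkpoint_probe()
    task = current_task()
    return (done = istaskdone(task),          # the task calling us is never finished
            depth = length(stacktrace()))     # and there is always a live call stack
end

probe = checkpoint_probe()
println(probe)
```

Run from the REPL, the stack should also contain the REPL’s own frames, so the “checkpoint” is always taken from inside unfinished calls on an unfinished Task.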

Given this, a cache cannot easily be made for this runtime state. Users must therefore avoid these kinds of constructs in order to produce a cache (or you can serialize them anyway and raise a fatal error when the user tries to use them), which imposes another restriction on this method. In summary, caching global state is time-consuming, difficult, and not beneficial.

Oh I see, you mean a save state is not useful when made obsolete by frequent changes. Regarding this limitation, there’s no distinction between PackageCompiler and saving a session.

Yes I was mainly thinking about users who would run the same code over the course of days and weeks rather than writing new code. They and their laptop batteries would benefit from a save state.

So the complicating factor in saving an active session, compared to PackageCompiler with precompile scripts, is really just all the active code. Is it really not feasible to save just the parts I want (definitions, globals, and compiled methods) while ignoring or pausing everything else? For example, I wouldn’t need the call stack or Task queue to be stored in a sysimage. The garbage collector pauses everything; couldn’t such a pause be used to save things with less chaos? Maybe I don’t get it because I don’t really know what PackageCompiler does and does not do.

To achieve this, you can cache only the compiled machine code instead of all the state. But this is basically what PackageCompiler does.

It’s feasible as long as you don’t do it for the Main module. The Main module is unstable and you should not cache it (definitions may change between sessions). But you mentioned that you don’t want to develop new code, so this is not a limitation.

So I guess what you want is something like this:

  1. Execute some Julia code in the REPL, consisting of function calls into other libraries, for example Plots.plot.
  2. The compiled binary code and other necessary serializable runtime metadata are cached.
  3. The next time you open the REPL, you re-execute your script to set up the runtime state you don’t want to cache, and then the compiled code is loaded. You don’t need to recompile it, so a lot of time is saved.

If this is what you want, then it can be achieved, as long as you don’t save the globals and definitions; you can always re-execute them, since they take little time compared to compilation. I have a private fork of Julia’s compiler for this kind of binary cache, which uses LLVM’s new linking architecture. It works much like what you described in this thread. It has all the limitations I mentioned above, and it works as long as you don’t do strange things.
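
The workflow above can be imitated in miniature with stock Julia. This is only a sketch of the idea, not the fork mentioned here: sum_of_squares is a hypothetical stand-in for a library call such as Plots.plot, and precompile only compiles within the current process rather than writing a cache to disk.

```julia
# Hypothetical workload standing in for a library call such as Plots.plot.
sum_of_squares(xs) = sum(x -> x^2, xs)

# Step 2 analogue: compile the specialization for Vector{Float64} without running it.
compiled = precompile(sum_of_squares, (Vector{Float64},))
println("compiled ahead of time: ", compiled)

# Step 3 analogue: cheap session setup is simply re-executed, then the call is made.
data = [1.0, 2.0, 3.0]
println(sum_of_squares(data))
```

As far as I know, statements of this kind are what `julia --trace-compile=<file>` records and what PackageCompiler’s precompile_statements_file option consumes, which is how the compiled code ends up persisted in a sysimage while globals in Main are left to be recreated.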

(entering the discussion without anything to say about it)

My favorite type of working code :wink:

(leaving the discussion now)