I keep wondering why the time-to-first-whatever is such a difficult problem. It is currently my main issue with Julia by a fair margin (I do know about Revise and PackageCompiler, but also about their limitations). There is clearly something I don’t understand, and I would appreciate some insights.
If I have a REPL session running, with a bunch of compiled methods in memory, why is it not possible to save the complete state of my session, with all its definitions and compiled LLVM/native code, from memory onto a file? Wouldn’t that be a generic way to solve the problem? (My wishful dream would be that I could just load a certain work snapshot from a file, with a selection of my favourite packages already loaded and a large set of important methods precompiled, and immediately start to compute and plot stuff without precompilation delays…)
Isn’t that basically what Revise and PackageCompiler already allow for? My understanding is that the same limitations would apply to any approach that saved the compilation state: i.e. you couldn’t redefine types saved in this image (à la Revise), and there may be existing compiler bugs/limitations revealed by this approach (à la PackageCompiler).
Well, Revise doesn’t survive reboots, and PackageCompiler doesn’t work with a whole workspace (which can have several non-trivially interacting packages loaded) and is tricky to get working in general, AFAIU.
So this is an honest question, I really have the feeling there is some core aspect of Julia’s design that makes this snapshot saving problematic, but I don’t know what it is…
I don’t think that’s the right question. It may not be impossible, but it would surely be a lot of work for a very specific purpose, and making compilation faster would have more general benefits.
Understood, thanks @Tamas_Papp. But my point is, making compilation faster (i.e. faster inference + caching more things) is different from making it instantaneous (i.e. caching all things)… The latter would undoubtedly also have its appeal.
In any case: is there any open GitHub issue (analogous to the PARTR one) tracking ideas/work towards the goal of faster compilation? Or should I just look at the latency tag? I haven’t seen any specific strategy spelled out (it might be happening behind the scenes, of course).
This branch lets one compile (small, simple) programs to native code as an executable, in an AOT manner like PackageCompiler, but uses an approach that is much closer to how CUDAnative works. You could feasibly use this branch, plus some hooks into codegen to record all new method definitions and such, and then compile those into an executable that, when run later, gets you back into a state closely matching the one you were in.
Of course, it’s not possible to save things like open network sockets and pointers to unknown C objects.
I had not really succeeded in using PackageCompiler until now, but I now get it! It is indeed a great solution, once you get it to run. I guess if you have no binary deps it is much smoother. I wish I could mark two posts as solutions, Fezzik also looks great.
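For anyone landing on this thread later, the workflow that finally worked for me looks roughly like this. It is only a sketch: `Plots` and `precompile.jl` are placeholders for whatever packages and warm-up workload you actually care about, and it of course requires PackageCompiler.jl to be installed.

```julia
# Minimal PackageCompiler sysimage sketch (assumes PackageCompiler.jl is installed).
# "Plots" and "precompile.jl" are placeholders for your own packages and workload.
using PackageCompiler

create_sysimage(
    ["Plots"];                                   # packages to bake into the image
    sysimage_path = "sys_plots.so",              # output shared library
    precompile_execution_file = "precompile.jl", # script whose calls get precompiled
)
```

Starting Julia afterwards with `julia --sysimage sys_plots.so` loads those packages and their compiled methods up front, which is about as close to the “saved snapshot” idea upthread as you can get today.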