Per-module, runtime-generated, reloadable usrimg.jl

proposal

#1

I want to open a topic for discussion:
Julia is a mixture of an interpreter and a JIT (a compiler to binary code); some of the code is
JIT-compiled into binary form, while some is interpreted line by line.

Do you think it is possible to cache the generated binary code in a shared library (.dll, .dylib, or .so), save it to disk, and reuse it in the next session?

The idea is to have a binary cache per module (with dependency tracking and an invalidation process to handle code changes).
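To make the dependency tracking and invalidation part concrete, here is a minimal sketch in Python (used only for illustration, this is not how Julia does or would implement it). It assumes a cache key derived from the source text of a module and of everything it transitively depends on, so that editing any dependency produces a new key and the stale entry is never reused. The names `cache_key`, `ModuleCache`, and `compile_fn` are hypothetical, and the in-memory dict stands in for shared libraries saved on disk.

```python
import hashlib
from typing import Callable, Dict, List


def cache_key(module: str, sources: Dict[str, str], deps: Dict[str, List[str]]) -> str:
    """Hash the module's source together with all transitive dependencies.

    The key changes whenever the module or anything it depends on changes.
    """
    h = hashlib.sha256()
    seen = set()

    def visit(name: str) -> None:
        if name in seen:  # also guards against dependency cycles
            return
        seen.add(name)
        for dep in sorted(deps.get(name, [])):  # sorted for a deterministic key
            visit(dep)
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(sources[name].encode("utf-8")).digest())

    visit(module)
    return h.hexdigest()


class ModuleCache:
    """Hypothetical per-module cache; the dict stands in for files on disk."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}
        self.compiles = 0  # number of cache misses, i.e. real compilations

    def load(self, module: str, sources: Dict[str, str],
             deps: Dict[str, List[str]], compile_fn: Callable[[str], object]) -> object:
        key = cache_key(module, sources, deps)
        if key not in self._store:
            self.compiles += 1
            self._store[key] = compile_fn(sources[module])
        return self._store[key]
```

With this scheme invalidation is implicit: nothing is ever marked dirty, a changed dependency simply yields a different key. The cost is that old entries accumulate and would need separate garbage collection.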

Do you think such an idea is feasible within the Julia framework?


#2

It’s not impossible, but also not quite simple. This would require heavy heuristic pruning because each session generates hundreds of MB of machine code, so the cost of relinking might quickly outweigh the benefits. As a very rough point of comparison: I use a heavily-templated C++ application with a ~1 GB runtime image, ~600 shared libraries (don’t ask), and tens of thousands of symbols. Startup time is on the order of one minute, with something like 70% of that time spent simply running strcmp on symbol names during coalescing.

The keywords to start with if you want to look at prior related work are “incremental linking” and “profile guided optimization”.

I believe the more likely approaches to improved REPL responsiveness and compilation overhead involve moving the frontend to Julia, moving to a bytecode interpreter, and eventually tiered compilation. These are more work than doing just the above caching optimization, but provide a much higher return on investment. Trading some codegen quality for compiler speed is significantly easier than getting good/fast code generation to begin with, and is fairly well understood.
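As a rough illustration of the tiered idea, the sketch below (in Python, purely conceptual) runs a cheap baseline implementation first and only pays for the expensive "optimizing" step once a function has proven hot. The class name `TieredFunction`, the `optimize` callback, and the call-count threshold are assumptions made for this example; real tiered compilers use more elaborate profiling and promotion rules.

```python
from typing import Callable, Optional


class TieredFunction:
    """Run a cheap baseline tier until the function is hot, then promote it."""

    def __init__(self, baseline: Callable, optimize: Callable[[], Callable],
                 threshold: int = 3) -> None:
        self.baseline = baseline      # tier 0: e.g. interpreted or quickly compiled
        self.optimize = optimize      # produces the tier-1 (optimized) callable
        self.threshold = threshold    # baseline calls before promotion
        self.calls = 0                # calls served by the baseline tier
        self.optimized: Optional[Callable] = None

    def __call__(self, *args):
        if self.optimized is not None:
            return self.optimized(*args)
        self.calls += 1
        if self.calls >= self.threshold:
            # Hot enough: pay the one-time optimization cost now,
            # so later calls use the optimized tier.
            self.optimized = self.optimize()
        return self.baseline(*args)
```

Code that runs only once or twice never triggers the expensive tier, which is where the responsiveness gain comes from.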


#3

Thanks, a very interesting read.

You are probably correct in the long run. As a remedy until then, I think a minimal feature would be
the ability to save and reload the binary state of a REPL session…
Even with potentially slow load times, it beats the re-compilation and re-parsing time by a few orders of magnitude.

Think about your 1 GB C++ project with 600 shared libraries… imagine that you had to recompile it every time you wanted to run the program… aging slowly in front of a computer screen :slightly_smiling_face: