# Overview
This PR aims to establish the foundation needed to save native code in "package images." From a technical standpoint, all it does is replace our current mix of serializers (`dump.c` for `*.ji` files, `staticdata.c` for system images) with a unified serializer based on `staticdata.c`. Necessary functionality, such as uniquing types and MethodInstances, supporting external CodeInstances and new method roots, and performing load-time invalidations, is now supported by `staticdata.c`. A key feature is the ability to link externally: the serialization format defines a tag for an external object, which is linked after loading by pointer relocation.
The core system image and all "package images" are stored as contiguous blobs in memory, each identified by a pair of `*begin, *end` pointers stored in `jl_linkage_blobs`. Each of these images is identified by the module `build_id`, and this value is used in encoding external references. For individual objects, "ownership" is decided by pointer address, i.e., which pointer pair encloses the given object. Some objects, like *external* MethodInstances (new specializations of callables owned by other packages), are deliberately created with pointer addresses that fall outside these ranges to ensure that they go through the uniquing pipeline.
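The ownership rule described above can be sketched as a range lookup. This is only an illustrative model, not the code in `staticdata.c`: addresses are plain integers, the `LinkageBlobs` class and its method names are hypothetical, and treating each range as half-open (`begin <= addr < end`) is an assumption.

```python
from bisect import bisect_right


class LinkageBlobs:
    """Hypothetical model of a table of image address ranges.

    Each image occupies [begin, end) and is tagged with its build_id.
    """

    def __init__(self):
        self._begins = []  # sorted start addresses
        self._blobs = []   # (begin, end, build_id), same order as _begins

    def register(self, begin, end, build_id):
        """Record a new image; ranges must be non-empty and must not overlap."""
        if begin >= end:
            raise ValueError("empty or inverted range")
        i = bisect_right(self._begins, begin)
        if i > 0 and self._blobs[i - 1][1] > begin:
            raise ValueError("overlaps the preceding blob")
        if i < len(self._blobs) and self._blobs[i][0] < end:
            raise ValueError("overlaps the following blob")
        self._begins.insert(i, begin)
        self._blobs.insert(i, (begin, end, build_id))

    def owner(self, addr):
        """Return the build_id of the blob enclosing addr, or None."""
        i = bisect_right(self._begins, addr) - 1
        if i >= 0:
            begin, end, build_id = self._blobs[i]
            if begin <= addr < end:
                return build_id
        return None


blobs = LinkageBlobs()
blobs.register(0x1000, 0x2000, build_id=0xAAAA)  # e.g. the system image
blobs.register(0x3000, 0x3800, build_id=0xBBBB)  # e.g. a package image
```

In this model, an object for which `owner` returns `None` lies outside every image, which corresponds to the case of external MethodInstances that are deliberately allocated outside the ranges so that they are uniqued.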
Compared to the original version of this PR, we've temporarily stripped all the work devoted to native code support. That will be restored in a future PR expected to land shortly after this merges. The goal here is to transition to one-serializer-to-rule-them-all without breaking Julia. This PR completes the goal of the original PR as well as items 2 & 4 listed under "Future work (TODOs)" below. Item 3 is already merged to master, so overall this PR has become far more ambitious than its original scope, despite having taken a step backwards with respect to native code.
This has become a 3-way collaboration (@vchuravy, @vtjnash, and @timholy), with the recent participation of @vtjnash having added enormously to our progress.
For reference, the original version of this post is included below, but note that several points no longer apply.
-------------------
# Overview (original)
This pull request is the first fruit of a tight collaboration between @vchuravy and @timholy. Our hope is that this is the inaugural PR in a series whose ultimate goal is to allow packages to save & reuse their precompiled native code. A second outcome might be to enable (or contribute to enabling) StaticCompiler.jl to be implemented with few external dependencies.
The journey is long, and this first pull request is intended to have no user-visible consequences (neither good nor bad). But we believe it establishes many of the necessary fundamentals.
# Background
Caching code requires [serialization](https://en.wikipedia.org/wiki/Serialization) and deserialization. Julia has several code (de)serializers, but here our focus is on two, the one in `dump.c` and the one in `staticdata.c`. `dump.c` writes the `.ji` files that we currently use for packages and an intermediate stage of building Julia, whereas `staticdata.c` creates the object files (`.so` on Linux) that serve as Julia's system image. `dump.c` can (now, after #43990) save almost all supportable objects *except* native code. Conversely, `staticdata.c` can save native code and has a more streamlined design, but currently is only useful for writing monolithic system images. Part of the ultimate goal of the series of PRs is to blend the best of both (de)serializers together.
In addition, substantial changes will be required in Julia's codegen/LLVM infrastructure. The core issue is that caching native code across multiple files is a lot like building a C application from a bunch of separate `.o` files: you need a linking step to get them to work together. Currently, Julia has no real mechanisms to perform this linking. Interaction among packages has to persist after the LLVM modules used to assemble them have been discarded.
# Details
The only way to engage the functionality here is to launch Julia with command-line arguments
```
--output-o $output --output-incremental=yes
```
which is not a supported combination on `master`. Hence this will *not* be used in package precompilation until we switch `base/loading.jl` to issue this combination of command-line arguments. Consequently, we can develop the required infrastructure, get all the pieces working, and then turn it on for regular usage.
This first PR aims to allow system-image-like blobs ("package images"?) to encode *external references* and perform most of the necessary linking to connect them. It consists of a new "tagged" serialization enum, `ExternalLinkage`, used to encode these links. It also includes much of the LLVM functionality needed for successful linkage, with one major exception described below.
Fundamentally, cross-references among Julia internal structures are made by pointers. External linkage is therefore achieved by pointer relocation. During serialization, package contents are copied into a single contiguous blob of memory. To make pointers relocatable, an external reference is decomposed into two pieces:
- to encode "which blob are we linking against?", we use the `build_id` of the toplevel module in the precompilation `worklist`
- within a blob, identity is determined by the offset from the blob's base pointer.
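The two-part decomposition above can be illustrated with a small sketch. It is a conceptual model only: the `Blob` and `ExternalRef` types and the `encode` and `relocate` helpers are hypothetical names, addresses are plain integers, and the actual `ExternalLinkage` encoding in `staticdata.c` is not reproduced here.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Blob(NamedTuple):
    build_id: int  # identifies which image this is
    base: int      # address where the blob currently lives
    size: int      # length of the blob in bytes


class ExternalRef(NamedTuple):
    build_id: int  # "which blob are we linking against?"
    offset: int    # position relative to that blob's base pointer


def encode(addr: int, blobs: Iterable[Blob]) -> Optional[ExternalRef]:
    """At save time, turn an absolute address into a relocatable reference."""
    for blob in blobs:
        if blob.base <= addr < blob.base + blob.size:
            return ExternalRef(blob.build_id, addr - blob.base)
    return None  # not inside any known blob


def relocate(ref: ExternalRef, loaded: Dict[int, Blob]) -> int:
    """At load time, turn a reference back into an absolute address."""
    blob = loaded.get(ref.build_id)
    if blob is None:
        raise LookupError(f"no loaded blob with build_id {ref.build_id:#x}")
    if not 0 <= ref.offset < blob.size:
        raise ValueError("offset lies outside the target blob")
    return blob.base + ref.offset


# The dependency sits at one address when the package is saved...
at_save = [Blob(build_id=0xAAAA, base=0x1000, size=0x100)]
ref = encode(0x1040, at_save)

# ...and at a different address when everything is loaded again.
at_load = {0xAAAA: Blob(build_id=0xAAAA, base=0x7F0000, size=0x100)}
new_addr = relocate(ref, at_load)
```

Because only the `build_id` and the offset are stored, the reference stays valid no matter where the target blob is mapped in a later session, provided the same build of the dependency is loaded.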
An introduction to details of the (de)serialization mechanisms used in `staticdata.c` can be found in the extensive comment at the top of the file.
## What this does
This successfully serializes and deserializes external links, and implements much of the functionality needed for "partial" LLVM modules. In particular, we support:
- saving lowered, type-inferred, and native code for methods defined in the package
- accessing global variables from the same package, even from compiled code
- calling compiled functions in other package images (partial, see below)
- accessing global variables defined in other package images, even from compiled code
It also introduces a "stub" implementation of a new standard library, `LLD_jll`, used for performing some of the linkage.
## What this doesn't do
Currently, the decision about whether a reference is internal vs external is deliberately over-simplified, and arrives at the wrong answer in important cases, such as when PkgB triggers novel specialization of a method in PkgA. Because of some challenges involving exported names and/or the need for a [trampoline](https://en.wikipedia.org/wiki/Trampoline_%28computing%29), it also duplicates native code in the `.text` section of the object file (written by LLVM) rather than linking to a unique implementation.
## Future work (TODOs)
We expect this PR to be followed by at least four more PRs:
1. one that implements de-duplication of the native code (likely via implementation of a trampoline)
2. one that expands/migrates functionality from `dump.c` to `staticdata.c` (adding methods to external functions, compiling novel specializations of external methods, uniquing compilations of the same `MethodInstance` by multiple downstream packages, managing backedges and invalidation, etc.)
3. one that makes `LLD_jll` a "real" standard library
4. one that makes this the default (or only) mechanism for precompiling packages
We welcome participation by others in these future developments.
## Future prospects
If all this works as well as we hope, we expect to see dramatic decreases in latency for precompiled workloads. Indeed, in favorable cases with little or no invalidation, compilation time may be nearly eliminated.
In such cases, the majority of Julia's remaining latency problem will be due to package loading. We do not expect this sequence of PRs to make load times worse, but despite efficiencies in the `staticdata.c` representation we also don't expect much improvement. Raw deserialization is likely to become faster, but on current master load time is already dominated by the cost of method insertion and invalidation, and that won't change in this sequence of PRs. In the future, there are a number of possible ways to improve load times (perhaps dramatically), but we plan to get this whole series merged before even beginning to contemplate tackling load times.
## Ideal schedule
We're well aware that important work remains to finalize Julia 1.8, and that work should take priority. However, once that ramps down it would be great to get this reviewed and merged fairly early in the 1.9 cycle. There's a long way yet to go, and we'll need time if we are to get the entire sequence merged for 1.9.