I have been asking Claude to investigate why cache sizes grow so much when not that much code is precompiled. I’m showing here the results of a smaller example (but I have one where the “accusations” are heavier when trying to explain a growth of 29 MB).
My question to compiler experts is basically, is this true?
tl;dr
Done. Everything is explained. Here's the summary:
Root cause: "Stub" MethodInstances in Julia's sysimage
Julia's sysimage (1.10) contains registered MethodInstances for eigen(Matrix{Float64}), cov(Matrix{Float64}), *(Matrix{Float64}, Matrix{Float64}), etc. — but
they are empty stubs with no compiled code.
This means that:
1. Julia "knows" these specializations exist (type-dispatch resolves)
2. But it has no native code nor serialized IR for them
3. When any package calls eigen during precompilation, Julia must compile the entire chain (eigen → eigen! → LAPACK.syevr! → ccall wrappers → ...) from scratch
4. The compiled code is then stored in the cache of the calling package (GMT), not in the sysimage
This is why pca(rand(Float32, 24, 4)) adds ~3.8 MB — the entire cov + eigen + linear algebra chain is recompiled and serialized inside GMT's pkgimage.
This is a Julia design problem, not GMT's. The stdlibs (LinearAlgebra, Statistics) should precompile these fundamental operations via PrecompileTools in their
own cache/sysimage, but they don't. Every downstream package pays the full cost.
The longer analysis
## PCA Cache Bloat Investigation (2026-02-16)
### Problem
`pca(rand(Float32, 24, 4))` in the precompile block adds ~3.8 MB to the cache.
### Measurements
| Precompile config | Cache Size | Delta from baseline |
|---|---|---|
| No pca (baseline) | 94.99 MB | — |
| `pca(rand(Float64, 24, 4))` | 97.92 MB | +2.93 MB |
| `pca(rand(Float32, 24, 4))` | 98.78 MB | +3.79 MB |
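For reference, a minimal sketch of how such cache sizes can be compared, assuming the default depot layout (the `compiled/v1.10/GMT` directory is where the package's pkgimage files live; adjust the path for your setup):

```julia
# Sum the sizes of GMT's cache files (.ji / .so pkgimages) in the default depot.
# Assumes DEPOT_PATH[1] is where packages get precompiled (the usual case).
cachedir = joinpath(DEPOT_PATH[1], "compiled",
                    "v$(VERSION.major).$(VERSION.minor)", "GMT")
bytes = sum(filesize(joinpath(root, f))
            for (root, _, files) in walkdir(cachedir) for f in files)
println("GMT cache: ", round(bytes / 2^20; digits = 2), " MB")
```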
### Root cause: "Stub" MethodInstances in Julia's sysimage
The Julia sysimage (1.10) contains **MethodInstances registered** for `eigen(Matrix{Float64})`,
`cov(Matrix{Float64})`, `*(Matrix{Float64}, Matrix{Float64})`, etc. — but they are **empty
stubs with NO compiled code**:
```
MethodInstance found: Tuple{typeof(eigen), Matrix{Float64}}
Has inferred code (cache): false

cov: Tuple{typeof(cov), Matrix{Float64}} → has code: false
*  : Tuple{typeof(*), Matrix{Float64}, Matrix{Float64}} → has code: false
```
The sysimage "knows" these specializations exist (type-dispatch resolves), but has **no native
code nor serialized IR** for them. The LAPACK internal functions (`eigen!`, `LAPACK.geev!`,
`LAPACK.syevr!`) have **zero** precompiled specializations.
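For anyone wanting to reproduce this check, here is a minimal sketch. It pokes at compiler internals (`Base.specializations` and the `cache` field of `MethodInstance`), which are not stable API and may change between Julia versions:

```julia
using LinearAlgebra, Statistics

# For every specialization of `eigen` already registered (e.g. by the sysimage),
# report whether a CodeInstance (inferred/compiled code) is attached to it.
# NOTE: Base.specializations and the :cache field are internals (Julia 1.10).
for m in methods(eigen), mi in Base.specializations(m)
    println(mi.specTypes, " → has code: ", isdefined(mi, :cache))
end
```

The same loop over `methods(cov)` or `methods(*)` produces the `cov` and `*` lines shown above.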
When GMT calls `pca` → `princomp!` → `cov(X)` + `eigen(cov_X)` during precompilation, Julia
must compile the **entire call chain** from scratch:
```
pca → princomp! → cov(Matrix{Float32}) → eigen(Matrix{Float32}) → eigen!(Matrix{Float32})
    → LAPACK.syevr!(Float32, …) → ccall wrappers → …
  + sortperm(Vector{Float32}) + Matrix{Float32} * Matrix{Float32} + sum/mean/…
```
The compiled code is then stored in **GMT's pkgimage cache**, not in the sysimage.
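One way to see exactly which specializations get compiled on GMT's behalf is Julia's `--trace-compile` flag, which logs a `precompile(...)` statement for every newly compiled MethodInstance. A sketch (the output file name is arbitrary, and this assumes GMT is installed):

```julia
# Run the workload in a fresh process, log every newly compiled specialization
# to trace.jl, then pick out the cov/eigen/LAPACK entries of the chain.
run(`julia --trace-compile=trace.jl -e 'using GMT; GMT.pca(rand(Float32, 24, 4))'`)
foreach(println, filter(contains(r"eigen|LAPACK|cov"), readlines("trace.jl")))
```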
Timing confirms this — 99.98% of execution time is compilation:
```
Float32 chain: 3.73s, 5.34M allocs, 357 MiB, 99.98% compilation
Float64 chain: 2.67s, 4.06M allocs, 271 MiB, 99.98% compilation
```
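These look like `@time` measurements; `@time` has reported the compilation share directly since Julia 1.8, so the split is easy to reproduce in a fresh session with just the cov/eigen part of the chain:

```julia
using LinearAlgebra, Statistics

X = rand(Float32, 24, 4)
@time eigen(cov(X))   # first call in a fresh session: seconds, mostly compilation
@time eigen(cov(X))   # second call: microseconds, no compilation reported
```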
### Why Float64 is slightly cheaper than Float32
Using Float64 saves ~0.86 MB vs Float32 because other GMT precompile paths (grdimage, plot,
etc.) already trigger some Float64 linear algebra specializations. Float32 creates a completely
separate specialization chain (`Eigen{Float32,...}`, `cov → Matrix{Float32}`, etc.).
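The two element types lead to disjoint specializations end to end, which is easy to see from the result types (a quick check, assuming the symmetric/Hermitian eigen path that `cov` output takes):

```julia
using LinearAlgebra, Statistics

# Each element type gets its own Eigen result type, hence its own compiled chain.
typeof(eigen(cov(rand(Float32, 24, 4))))   # Eigen{Float32, Float32, ...}
typeof(eigen(cov(rand(Float64, 24, 4))))   # Eigen{Float64, Float64, ...}
```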
### This is a Julia design limitation
The stdlib packages (LinearAlgebra, Statistics) should precompile these fundamental operations
via PrecompileTools in their own cache/sysimage, but they don't. Every downstream package that
calls `eigen`, `cov`, or matrix operations pays the full compilation cost and stores the
entire LAPACK chain in its own pkgimage.
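For illustration only, a workload of the kind this argument suggests could live upstream (the module name is hypothetical, nothing like this ships in the stdlibs today, and whether it should is exactly the open question):

```julia
module LinAlgWarmup  # hypothetical name; the stdlibs ship no such workload

using PrecompileTools, LinearAlgebra, Statistics

@setup_workload begin
    @compile_workload begin
        for T in (Float32, Float64)
            X = rand(T, 8, 4)
            C = cov(X)            # Statistics: cov(Matrix{T})
            eigen(C)              # LinearAlgebra: eigen → eigen! → LAPACK
            C * C                 # *(Matrix{T}, Matrix{T})
            sortperm(rand(T, 8))
        end
    end
end

end # module
```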
### Resolution
`pca` has been commented out of the precompile block. It's a niche function — users who need PCA
pay ~4 seconds of compilation cost on first call, but GMT's cache shrinks by ~3.8 MB.