Adding packages + pre-compilation vs. (docker pull) pre-built sysimage

I made an observation that might be useful. I have a base julia docker image with jupyterlab installed. I then added another layer that does:
RUN julia -e 'using Pkg; Pkg.add(["Plots","StatsPlots","DataFrames"]); Pkg.precompile();'

Building that docker layer took 2 min 54 seconds, which is essentially the time julia needed to install and precompile the packages. (It’s a prototype, so I didn’t build a sysimage.)

With no images cached locally, docker pull took 58 seconds for the base image and 39 seconds for the delta layer that contains the precompiled packages.

In effect, this replaced 2 min 54 seconds of local installation and precompilation with a 39-second download.

Right now this is not particularly useful, but I think it could be made useful. For one, it might not need to be a docker image at all if julia could distribute custom sysimages. It would probably also require a limited set of sysimage releases, like latest DataFrames + latest Plots, or whatever is actually needed in practice. And of course a user would need to be able to locate a suitable pre-built sysimage.
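
For illustration, a sysimage variant of the layer might look roughly like the sketch below. It was not part of the prototype. It assumes PackageCompiler.jl is used, that a C compiler such as gcc is present in the image, and that the build user can write to the hypothetical path /opt/datascience.so.

```dockerfile
# Sketch only: assumes PackageCompiler.jl and a C compiler (gcc) are available in the image.
FROM statisticalmice/julia-jupyter:1.6-buster

# Install the packages plus PackageCompiler.jl (an assumption; the prototype did not use it).
RUN julia -e 'using Pkg; Pkg.add(["Plots", "StatsPlots", "DataFrames", "PackageCompiler"])'

# Bake the packages into a custom sysimage; /opt/datascience.so is a hypothetical path.
RUN julia -e 'using PackageCompiler; create_sysimage([:Plots, :StatsPlots, :DataFrames]; sysimage_path="/opt/datascience.so")'
```

Julia would then be started with `julia --sysimage /opt/datascience.so` to pick up the baked-in packages. Building such an image is likely to take noticeably longer than plain precompilation, which is part of why doing it once and distributing the result looks attractive.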

I’m not that familiar with (pre-)compilation in julia, so I’m not sure how useful this could be, but trading work done on the user’s laptop for work that was already done and is available for download sounds like a good deal.

docker images
REPOSITORY                      TAG                            IMAGE ID       CREATED          SIZE
statisticalmice/julia-jupyter   1.6-buster-datascience-proto   2d3e28505ea3   12 minutes ago   1.89GB
statisticalmice/julia-jupyter   1.6-buster                     acf0e16d0c9e   7 days ago       1.17GB

I’m also struggling with it. The precompilation in my case takes ~10 minutes, which is a lot.
But I have no idea how to solve it consistently. I raise the versions of my packages occasionally, though not often. Is there a way to cache the precompiled data so that it is reused while the versions stay the same, and automatically invalidated when I raise the version of some package? It would be great to have some way to do this.
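
For the Docker case discussed in this thread, one possible approach is to let Docker's layer cache do the invalidation. This is a sketch, not something tested here: it assumes the package versions are pinned in a Project.toml and Manifest.toml next to the Dockerfile, and the /opt/project path is hypothetical.

```dockerfile
# Sketch only: the image tag comes from this thread; the paths are assumptions.
FROM statisticalmice/julia-jupyter:1.6-buster

# Hypothetical project directory; JULIA_PROJECT makes julia use it as the active environment.
ENV JULIA_PROJECT=/opt/project

# Copy only the environment files first. Docker reuses the cached layers below
# until the contents of one of these two files change.
COPY Project.toml Manifest.toml /opt/project/

# Install the exact versions pinned in Manifest.toml and precompile them.
RUN julia -e 'using Pkg; Pkg.instantiate(); Pkg.precompile()'

# Copy the rest of the code afterwards, so editing it does not redo the precompilation.
COPY . /opt/project/
```

With this ordering, the slow RUN step is repeated only when Project.toml or Manifest.toml changes, which happens when a package version is raised. The precompiled files end up in the depot of the user that ran the build, so the container would need to run as that same user to benefit from them.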