Automatically build a system image on package installation

For colleagues used to a workflow centered around running Python scripts from the command line, I was wondering if a similar workflow with Julia could be facilitated by automatically building a system image containing relevant packages when the Julia package is installed. Has anyone tried this before? Is it a bad idea?

I was reading in the docs for the pyjulia interface that it supports using a Julia system image, but provides instructions only for how to manually build one.
https://pyjulia.readthedocs.io/en/latest/sysimage.html

2 Likes

I regularly compile a bunch of packages. As long as it doesn’t recompile for every new package but does it in batches, performance won’t be that bad.

I was thinking more of providing a Julia package which my colleagues could use that would automatically build the needed system image, so that it’s snappy to use when they call it from Python, possibly from the command line. The system image build would thus be done when this particular package is installed.

A few topics that seem related:

https://github.com/JuliaLang/julia/pull/35794

https://github.com/JuliaLang/julia/issues/32906

2 Likes

I think that’s an interesting idea. But if you want to provide something that works fully automatically (for example by replacing Julia’s default sysimage), you’ll probably have to think carefully about what you want to happen when one of the packages baked into the sysimage gets updated. The situation you’d want to avoid is one where your colleagues would update the packages they use but, because they keep using an out-of-date sysimage, would not actually get the behavior of updated packages.

One thing I think could work would be to install in their PATH a small script called julia that acts as a proxy to the real julia binary. The script would check the julia version and the versions of all relevant packages, and see whether it finds a suitable sysimage. If yes, it runs julia with the sysimage. If not, it warns the user and/or proposes to create a sysimage.
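A minimal Python sketch of such a proxy's decision logic, assuming a hypothetical JSON "stamp" file written next to the sysimage at build time that records the julia version and package versions it was built with. The function names and the stamp format are illustrative, not an existing tool; `--sysimage` is the regular julia command-line switch.

```python
import json
import sys
from pathlib import Path


def sysimage_is_current(stamp_file, julia_version, package_versions):
    """Return True if the stamp written at build time matches the current setup."""
    stamp_file = Path(stamp_file)
    if not stamp_file.is_file():
        return False
    try:
        stamp = json.loads(stamp_file.read_text())
    except ValueError:  # unreadable or corrupted stamp: treat as out of date
        return False
    if not isinstance(stamp, dict):
        return False
    return (stamp.get("julia") == julia_version
            and stamp.get("packages") == package_versions)


def julia_command(real_julia, sysimage, stamp_file,
                  julia_version, package_versions, args):
    """Build the argv for the real julia binary, adding the sysimage if usable."""
    cmd = [str(real_julia)]
    if Path(sysimage).is_file() and sysimage_is_current(
            stamp_file, julia_version, package_versions):
        cmd += ["--sysimage", str(sysimage)]
    else:
        # Fall back to plain julia and tell the user why startup is slow.
        print("warning: no up-to-date sysimage found, running without one",
              file=sys.stderr)
    return cmd + list(args)
```

The proxy would then hand the returned list to something like `subprocess.run` or `os.execv`. How the current julia and package versions are obtained (for example from `julia --version` and the Manifest) is left open here.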

We’re currently trying to use a similar mechanism to provide sysimages for LanguageServer to be used by eglot-jl. The approach we are experimenting with is a naming convention for the sysimage that includes the julia version as well as a hash of the Manifest contents. If you want to look at the details, the relevant PR is here:

https://github.com/non-Jedi/eglot-jl/pull/12
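To illustrate the idea of such a naming convention (this is not the code from the PR, just a sketch in Python), the file name can be derived from the julia version and a digest of the Manifest file. The prefix, the truncated SHA-256 digest and the `.so` extension are assumptions; the extension in particular differs between platforms.

```python
import hashlib
from pathlib import Path


def sysimage_name(julia_version, manifest_path, prefix="sysimage", ext=".so"):
    """Name a sysimage after the julia version and the Manifest contents.

    Any change to the Manifest (e.g. a package update) yields a different
    name, so an out-of-date sysimage is simply never found.
    """
    digest = hashlib.sha256(Path(manifest_path).read_bytes()).hexdigest()[:16]
    return f"{prefix}-v{julia_version}-{digest}{ext}"
```

A launcher would compute this name for the active environment and only pass the sysimage to julia if a file with exactly that name exists.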

The VS Code people probably have much more experience with this. They seem to tackle the same problem by comparing the last-modified dates of the sysimage and the Manifest.
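A timestamp comparison of that kind could look roughly like the following Python sketch. It only illustrates the idea and is not taken from the VS Code extension; the function name is made up.

```python
import os


def sysimage_is_stale(sysimage_path, manifest_path):
    """Return True if the sysimage is missing or older than the Manifest."""
    if not os.path.exists(sysimage_path):
        return True
    # A Manifest modified after the sysimage was built suggests that
    # packages changed since the build.
    return os.path.getmtime(manifest_path) > os.path.getmtime(sysimage_path)
```

Compared with hashing the Manifest, this is cheaper, but it can be fooled by anything that touches the file without changing it, or that restores an old file with an old timestamp.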

And I guess I should also say that all of these techniques can fail in some way or another if you don’t have enough control over the environment that users want to use (stacked environments might cause issues, for example).


While I may be implying that the process is not entirely straightforward, I wouldn’t want to deter you from trying to implement this. And I’d very much like to hear about your progress!

1 Like

I just want to mention that there are users that would never themselves run julia. I often write tools for people with zero programming knowledge. These people do not know or need to know how to even start the REPL. They just want a “program”. It would be amazing to have the option to wrap an application (i.e. a package with a Manifest file that basically exports just one main function) with some machinery that locally produces an executable that was compiled for and on that local system. Boom.

So yea, we would need some way for them to install that package, and some conventions on where to place that executable, but it would mean that if I made a tool (that passed enough testing, vetting, etc, i.e. I don’t want to update it every week) I’d be able to get the users to “install” (which means internally, Pkg.add, build, BB the app, with the usage file, and link the binary to some shortcut icon on their desktop or some such) it and use it with maximum speeds.

But as I said, it would be mostly useful for people that don’t need to start the REPL.

4 Likes

I agree, but realistically, only the package I am providing would ever be updated by the user, thus triggering the recompilation.

The user would in this case not be concerned with julia environments, but rather load and use the system image from python. As soon as the user actually wants to use julia and not only call some function from python, I think a manual process is worth the effort.

I’ll be sure to report back if I manage to get something usable out of this :slight_smile: Thanks for all consideration!

Cool! Then I think using system images is a very good idea. You’re planning to trigger the creation of the sysimage in the build step of your package’s installation?

Thanks!

We already have that option, no? Couldn’t you ship your application with a Julia script that:

  • instantiates the environment,
  • calls PackageCompiler to create a sysimage or an app,
  • optionally install a desktop shortcut icon linking to the app (or a script running it)?

Provided that Julia is correctly installed on the users’ systems, installing the app should be as simple for them as downloading an archive, uncompressing it and double-clicking on the install script.

On your side, writing a script as described above should be almost no work.
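For the first two steps, a rough sketch of such an install script, written here as a small Python wrapper that assembles the julia invocation. It assumes that `julia` is on the PATH and that PackageCompiler can be loaded from the environment in use; the function name and the paths in the usage example are hypothetical. `Pkg.instantiate` and `create_sysimage` with its `sysimage_path` keyword are existing functions.

```python
import subprocess

# Julia code run by the installer: ARGS[1] is the sysimage path, the
# remaining arguments are the names of the packages to bake in.
JULIA_CODE = (
    "using Pkg; Pkg.instantiate(); "
    "using PackageCompiler; "
    "create_sysimage(Symbol.(ARGS[2:end]); sysimage_path=ARGS[1])"
)


def build_install_command(project_dir, packages, sysimage_path, julia="julia"):
    """Return the argv that instantiates the project and builds a sysimage."""
    return [julia, f"--project={project_dir}", "-e", JULIA_CODE,
            str(sysimage_path), *packages]


def install(project_dir, packages, sysimage_path):
    """Run the build; raises CalledProcessError if julia exits with an error."""
    subprocess.run(build_install_command(project_dir, packages, sysimage_path),
                   check=True)
```

Calling `install("/opt/myapp", ["Plots"], "/opt/myapp/sys.so")` would then perform the build. Passing the path and package names as command-line arguments rather than pasting them into the Julia code avoids quoting problems with unusual paths.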

FWIW I’ve already used this kind of strategy a couple times, but for users who seem to be more tech-savvy than yours (and on Linux). I would ship a Makefile alongside the sources of the app, and they would clone the git repo and run make or make install to build a sysimage and install in their PATH a script running the app.

1 Like

If you are using Julia via pyjulia, you’d need to create the sysimage after pyjulia is installed (as it has to pull off some tedious magic). Unfortunately, I don’t think Python’s package installer (pip) has an appropriate post-install hook mechanism to make this work. If your colleagues use conda, it’s probably possible to do something via a post-link hook (see “Adding pre-link, post-link, and pre-unlink scripts” in the conda-build documentation). Having said that, I think it’s better to do this in a Python script/package that compiles the sysimage the first time it is executed.
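A minimal sketch of the "compile on first execution" approach in Python. The `build` callable is a placeholder for whatever actually creates the sysimage (for instance the manual steps from the pyjulia documentation); the function name is made up.

```python
from pathlib import Path


def ensure_sysimage(sysimage_path, build):
    """Return the sysimage path, calling build(path) only if the file is missing.

    `build` is a user-supplied callable (hypothetical here) that creates the
    sysimage at the given path.
    """
    path = Path(sysimage_path)
    if not path.is_file():
        path.parent.mkdir(parents=True, exist_ok=True)
        build(path)  # slow, but only happens on the first run
    return path
```

The Python package would call this at import time or at the start of its command-line entry point, and then start Julia with the returned path (pyjulia accepts a sysimage when initializing Julia, as described in the documentation linked in the first post).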

Good points. I’ll most def look into this (remember I first need to have one of my tools reach this magical realm of doesn’t-need-to-be-updated-every-week).

1 Like

New PackageCompiler.jl has an option to make an “app” where the user doesn’t even need Julia installed. On mobile, so can’t easily find the link, but check out Kristoffer’s JuliaCon presentation from this year.

One thing I’m not clear on is whether the app can take arguments. But regardless, it looks pretty neat.

It can, they get passed to the julia main function.

2 Likes

While I don’t dispute the use case for standalone, self-contained, mostly precompiled “executables”, I think that in 2020 most of the people you are talking about above really want a web service, maybe on the local intranet.

Agreed!

Having these apps work from a browser is better in many respects, but creating executable apps is often simpler, depending on the task. As mentioned above, I’d argue that it’s pretty straightforward to create an executable on the user’s local machine. Maybe soon, what with JSServe etc., it might be equally easy to build a webapp.

1 Like

I tend to agree on that point, especially when all user inputs can be nicely presented in a web UI and the output can be nicely presented in the UI, or downloaded as a data file to be further analyzed by another program.

However I’m under the impression that there are things that are difficult to do with web services. I’m thinking in particular about I/O from/to multiple files on the user’s computer, as would be needed by a more complex program that looks for data/parameters in multiple input files, or produces several different and complex outputs.

(But I don’t really know anything about such topics, so I might be wrong. And I guess a lot of use cases are covered by a simple process where the user uploads one input file, the “program” runs remotely, and the user subsequently downloads a results file)

1 Like

This package does something like what you are asking for

https://github.com/jlapeyre/julia_project

I already posted about it. But, I think it’s worth posting here because it addresses the issue.

julia_project is a Python package. You use it inside another Python package and module to manage the Julia dependency. There is a method compile_julia_project() that you call to make a system image that will be found the next time you import the Python module. My operating assumption is that people love Python and feel comfortable with it, so asking them to call a Python method is a very small ask. On the other hand, Julia is unknown to them, and they don’t have time to be exposed to anything new. However, if your users don’t even like to be told to call a Python method, you might automate it. I did not add a keyword option to automatically compile on installation, but that would be an easy modification.
It works like this in your Python module code:

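import logging
import os

from julia_project import JuliaProject  # import path assumed from the package name

# Root directory of the Python package (one level above this module's directory).
mymodule_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

julia_project = JuliaProject(
    name="mymodule",
    package_path=mymodule_path,
    registry_url="git@github.com:myuser/MyModuleRegistry.git",
    logging_level=logging.INFO,  # or WARN, or ERROR
)

julia_project.run()  # This executes all the management features.


def compile_mymodule():
    # Build a system image that is found the next time this module is imported.
    julia_project.compile_julia_project()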
1 Like