Simple way to precompile standard package for batch

I have been using Julia quite extensively for over a year now, but because I’m me and Julia is Julia, I still consider myself a ‘newbie’: I still struggle to do very basic things. [I am not complaining about this, just asking for patience.]

I have seen discussions here about precompiling, images, environments, etc., but there is one (basic, simple?) issue I run into all the time and need help with.

For reasons that are not important here, I need to run files in batch mode. Some are 2 lines long and do very simple things, but spend as long as 20 seconds each time just loading the package.

For example:

using Plots
plot((1:5).^2)

Is there any simple thing I can do (or tell my students to do) that would let me run a script like this without the big overhead of loading ‘Plots’ (or some other package) each time?

Thanks for any help and apologies if the answer is here somewhere (if it is, it obviously wasn’t simple enough for me to follow).

Thanks in advance

I think you’re just asking for PackageCompiler?

Hi Nils,
Possibly. Is there a simple way to use that here?
I’m sorry, but I didn’t get that from the documentation.
Thanks!
[update - I had looked at older documentation - let me try the more recent one - looks like they have simplified it - sorry!]
Further update - it still seems very complicated to do such a simple thing!!

It sounds like you are looking for a sysimage, so the relevant section is here

Thank you. I’ll work with it a bit.

I was hoping there would be a way to just easily cache a compilation of a single package.

So I have to spend about half an hour each time a package is updated (and for every package) creating sysimages, just so they load quickly when I want to use them? [i.e., there is no other simpler, more convenient way?]

Yes, that’s the correct interpretation I believe - hence why the docs mention that this should only be done for packages where one doesn’t rely on frequent updates.

Just to be clear: you don’t have to recompile the sysimage every time a package updates. You only need to do so if you specifically want to update a package that you baked into your sysimage. So if you’re happy with the current functionality in Plots, you can just bake it in and only update if and when you either encounter some bug that’s fixed in a newer version or there’s some new functionality in a newer version that you can’t live without.

OK - thanks very much for that. I guess a ‘wish list’ would include a Julia feature that kept packages in the compiled state they were in when last used, unless an update had happened in the interim.

This is a hard problem, there has been loads of discussion on this (search for “time to first plot”, “compile cache” and “method invalidation” on here if you’ve got quite a few hours to spare), and it is (and has been for some time) at or near the top of the priority list of the core developers to improve this.

That said, a lot of progress has been made over the past few releases and this trend is likely to continue (you can check against the 1.5 beta or the 1.6 dev version to see whether they improve your situation).

What about the analogue of .pyc files? Where, for example, if you ran myscript.jl, then a myscript.jlc would be created which would have all the necessary caching taken care of next time?

I’m not sure it’s super useful to rehash all the discussions around this topic here - I’d really encourage you to search the forum (and maybe also the main Julia repo issues on GitHub) if you’re interested in this topic.

Suffice to say that a lot of thinking and effort has gone into this by people orders of magnitude more capable than me, and I can’t really comment on the validity of approaches in other languages to solving this Julia-specific issue. If you do have an idea though that you think hasn’t been considered in the relevant discussions do raise it of course!

OK - thank you!

Have you tried starting Julia with -O1 or even -O0? If the scripts are small with no large loops, that might give you better overall performance.
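
One way to check whether a lower optimization level actually helps is to time the same script in fresh Julia processes. The following is only a rough sketch: the stand-in script and the variable names are made up for illustration, and you would point `script` at one of your own batch files instead.

```julia
# Time one script at several optimization levels, each in a fresh Julia process.
julia_exe = joinpath(Sys.BINDIR, Base.julia_exename())

# Stand-in script so the sketch runs on its own; replace with a real script path.
script = joinpath(mktempdir(), "myscript.jl")
write(script, "println(sum((1:5).^2))\n")

timings = Dict{Int,Float64}()
for level in 0:2
    # -O2 is Julia's default; -O0 and -O1 trade runtime speed for less compile time.
    cmd = `$julia_exe --startup-file=no -O$level $script`
    timings[level] = @elapsed run(cmd)
    println("-O", level, ": ", round(timings[level]; digits = 2), " s")
end
```

The numbers will vary from run to run, so it is worth repeating the measurement a few times before drawing conclusions.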

Thanks! I will give that a try!

I’d encourage you to follow a workflow in which:

  1. the dependencies of your scripts are explicitly managed in an environment (as defined by the Project.toml and Manifest.toml files, and handled by Pkg), and
  2. you put a make.jl script (or build.jl, or whatever you want to call it) alongside your sources, that automates the creation of a system image using PackageCompiler, and
  3. optionally, everything is version-controlled.

This way, you know when some dependencies have been updated: Manifest.toml reflects the changes, and the version control system can help you know when this happens if you’re not the author of such updates. And you can then simply run make.jl again and be done with it.
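
To make this concrete, here is a minimal sketch of what such a make.jl could look like. It assumes that PackageCompiler is installed, that the Project.toml next to make.jl lists Plots, and that the file names (`sys_plots`, `precompile_plots.jl`) are just placeholders I picked for illustration.

```julia
# make.jl: a sketch that rebuilds a custom system image for the scripts in this directory.
using PackageCompiler   # assumed to be installed, e.g. in the default environment
using Libdl             # standard library; gives the platform's shared-library extension

# Hypothetical choices: adjust the package list and file names to your project.
project_dir = @__DIR__
packages = [:Plots]
sysimage_path = joinpath(project_dir, "sys_plots." * Libdl.dlext)

# A small workload that exercises the calls the batch scripts make,
# so that their compiled code ends up in the image as well.
precompile_file = joinpath(project_dir, "precompile_plots.jl")
write(precompile_file, """
using Plots
plot((1:5).^2)
""")

create_sysimage(packages;
                project = project_dir,  # the Project.toml/Manifest.toml next to make.jl
                sysimage_path = sysimage_path,
                precompile_execution_file = precompile_file)
```

Once the image exists, a script can be started with it via the `--sysimage` flag, e.g. `julia --sysimage sys_plots.so myscript.jl` on Linux (the extension differs on other platforms).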

In my experience, generating a new system image takes a few minutes (I don’t think I’ve seen any case taking more than 10 minutes, except maybe on Windows systems where it can be a bit longer than on Linux). And you don’t have to wait for it to complete; you can still work as usual while a new system image is compiling in the background: simply start Julia without using any custom sysimage (and pay the potentially high startup/loading times just this one time).

NB: if you tend to use several Julia versions at the same time, I would advise following a convention that incorporates the Julia version number in the system image name. This is because a system image is valid only for the Julia version with which it has been compiled.
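
As an illustration of such a convention, a small helper could build the file name from `VERSION`. The function name and the `sys_<label>_v<version>` pattern are hypothetical; any scheme that keeps the version visible would do.

```julia
using Libdl  # standard library; provides the platform's shared-library extension

# Hypothetical helper: builds a file name such as "sys_plots_v1.10.4.so".
function sysimage_name(label::AbstractString; version::VersionNumber = VERSION)
    v = "$(version.major).$(version.minor).$(version.patch)"
    return "sys_$(label)_v$(v).$(Libdl.dlext)"
end

println(sysimage_name("plots"))
```

The resulting name can then be used both when creating the image and when passing it to `julia --sysimage`.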

I think @ffevotte’s ideas are good ones. If what you’re doing is something where you have many scripts that all need to generate a plot or two, or suck down some data and output some small thing, then it should be possible to have one sysimage that includes all the various packages needed for all the tasks. You don’t need to build a sysimage for script A and a different sysimage for script B, etc.