Hi,
I just want to share my experience building a command-line tool in Julia.
A few years ago, I developed a small (unregistered) package, FITSexplore, to manipulate FITS files and their headers/keywords.
Recently, I decided to turn it into a command-line tool to filter files according to their keywords.
App
First, I used the fairly new [apps] section in Project.toml. This is very convenient: in the Pkg REPL, just type
app add https://github.com/FerreolS/FITSexplore.jl
then add ~/.julia/bin to your PATH, and it works out of the box.
I think this app feature is a very good idea from a user perspective, as it is very simple to install. The drawback is that the user has to pay the cost of launching Julia + TTFX at each call.
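For readers who have not used the apps feature yet, here is a minimal sketch of what such a package can look like. The module name `FITSexploreSketch`, the app name `fitsexplore-sketch`, and the helper `describe` are hypothetical and are not taken from the real FITSexplore package; the sketch assumes a Julia version recent enough to support `@main` and the `[apps]` section.

```julia
# Project.toml (excerpt), declaring one app for the hypothetical package below:
#
#   [apps]
#   fitsexplore-sketch = {}

module FITSexploreSketch  # hypothetical module name, not the real package

# Pure helper so the logic can be checked without touching stdout.
function describe(args::Vector{String})::String
    isempty(args) && return "usage: fitsexplore-sketch <files...>"
    return string("would inspect ", length(args), " file(s)")
end

# `@main` marks this function as the package entry point used by the app launcher.
function (@main)(args)
    println(describe(collect(String, args)))
    return 0
end

end # module
```

Keeping the logic in a plain function and the printing in the entry point makes the same code easy to call from the REPL or from tests.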
Initially, a standard call extracting specific keywords from a folder containing a few dozen files took ~10 seconds.
After chasing dispatch issues (not so simple when keyword values can have multiple types) with JET/SnoopCompile, each call dropped to ~2.5 seconds.
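To illustrate the kind of dispatch issue involved, here is a minimal sketch, assuming header values are restricted to a small `Union` instead of `Any`. The alias `CardValue`, the function `format_value`, and the sample header are hypothetical and do not come from FITSexplore. Tools such as JET's `@report_opt` can then be used to check that calls like these are resolved statically.

```julia
# Hypothetical stand-in for FITS keyword values: a small, closed set of types.
const CardValue = Union{Bool,Int,Float64,String}

# Each branch sees one concrete type, so the calls inside it need no runtime dispatch.
function format_value(v::CardValue)::String
    if v isa Bool
        return v ? "T" : "F"
    elseif v isa Int
        return string(v)
    elseif v isa Float64
        return string(v)
    else
        return v  # already a String
    end
end

# A Dict with a concrete value Union, rather than Dict{String,Any}.
header = Dict{String,CardValue}(
    "SIMPLE" => true,
    "NAXIS" => 2,
    "EXPTIME" => 1.5,
    "OBJECT" => "M31",
)
```

The design choice is to pay for a few explicit branches in exchange for code the compiler can fully infer.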
Precompilation
Using PrecompileTools and adding the right calls in a @compile_workload block (nested in @setup_workload), each call dropped further to ~2 seconds.
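A minimal sketch of such a workload is shown below. It assumes PrecompileTools is listed as a dependency of the package; the module name `PrecompileSketch` and the function `run_cli` are hypothetical placeholders for the real entry point.

```julia
module PrecompileSketch  # hypothetical package module

using PrecompileTools: @setup_workload, @compile_workload

# Hypothetical stand-in for the real CLI logic.
function run_cli(args::Vector{String})::Int
    return length(args)
end

@setup_workload begin
    # Setup code prepares inputs; it is not itself the precompilation target.
    sample_args = ["--help"]
    @compile_workload begin
        # Calls made here during package precompilation are cached.
        run_cli(sample_args)
    end
end

end # module
```

The closer the workload is to a real invocation (same argument types, same code paths), the more of the first-call latency it is likely to remove.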
Sysimage
As I still found this too slow, I tried building a sysimage:
using PackageCompiler
create_sysimage(["FITSexplore"]; sysimage_path="fitsexplore_precompile.so")
Adding -J fitsexplore_precompile.so, the call takes a bit less than 1 second.
The issue is that, from the user’s point of view, this loses the simplicity of the app add installation command. The user now has to build the sysimage and add -J fitsexplore_precompile.so manually. Maybe there is a way to automate this during installation.
Another drawback is that the sysimage size is >200 MB.
JuliaC --trim=safe
To get an even faster startup with a smaller binary, I tried building a relocatable binary with juliac --trim=safe. At first, it was a nightmare, with many dispatch issues, but with the help of JET and Copilot I eventually succeeded.
I had to remove dependencies like ArgParse and StatsBase and rewrite a few parts. For example, there were many errors related to show method calls, and Copilot suggested replacing them with:
@inline function emit_stdout(msg::String)
    GC.@preserve msg begin
        ccall(
            :write, Cssize_t, (Cint, Ptr{UInt8}, Csize_t),
            1, pointer(msg), ncodeunits(msg)
        )
    end
    return nothing
end
I’m not sure whether this is the best way to print output, but I didn’t find a more idiomatic Julia solution.
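As an illustration of the kind of rewrite that dropping ArgParse implies, here is a minimal hand-written parser with fully concrete types. This is a sketch and not the FITSexplore code: the `Options` struct, the `--keyword` flag, and the function name `parse_cli` are hypothetical, and the sketch has not been checked against the `--trim=safe` verifier.

```julia
# Hypothetical options; the real FITSexplore flags may differ.
struct Options
    help::Bool
    keywords::Vector{String}
    paths::Vector{String}
end

# Walk the argument list once; every variable keeps a single concrete type.
function parse_cli(args::Vector{String})::Options
    help = false
    keywords = String[]
    paths = String[]
    i = 1
    while i <= length(args)
        arg = args[i]
        if arg == "--help" || arg == "-h"
            help = true
        elseif arg == "--keyword" && i < length(args)
            i += 1                    # consume the flag's value
            push!(keywords, args[i])
        else
            push!(paths, arg)         # anything else is treated as a file path
        end
        i += 1
    end
    return Options(help, keywords, paths)
end
```

For example, `parse_cli(["--keyword", "EXPTIME", "a.fits"])` yields one keyword and one path.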
All the cleanup and refactoring made the code significantly faster, even for the app version. The current timings are shown below; we can see that, in the app version, launching Julia takes most of the time (~1 s):
| | --help | more complex call |
|---|---|---|
| App | 1.03 s | 1.04 s |
| Sysimage | 600 ms | 640 ms |
| juliac --trim | 53 ms | 106 ms |
The binary bundle weighs 92 MB.
It is much faster, with no TTFX anymore, but from a user perspective it is no longer a no-brainer install like app add .... To make things easier, I added a small Makefile to build the bundle locally and move it to a new .julia/bundles/fitsexplore-bundle folder, with a symlink to the executable in .julia/bin/fitsexplore.
This allows users to get the CLI either with app add ... or via the Makefile, as long as .julia/bin is in their PATH.
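The install step of that Makefile can be sketched in Julia as follows. This is not the actual Makefile: the function name `install_bundle` is hypothetical, the sketch copies the bundle instead of moving it, and the location of the executable inside the bundle (`bin/fitsexplore`) is an assumption.

```julia
# Copy a built bundle into the depot and expose its executable through <depot>/bin.
# Assumption: the executable sits at bin/<name> inside the bundle.
function install_bundle(bundle_src::String, depot::String; name::String="fitsexplore")
    bundle_dst = joinpath(depot, "bundles", "$name-bundle")
    bin_dir = joinpath(depot, "bin")
    mkpath(dirname(bundle_dst))
    mkpath(bin_dir)
    cp(bundle_src, bundle_dst; force=true)   # replace any previous install
    link = joinpath(bin_dir, name)
    rm(link; force=true)                     # drop a stale link, if any
    symlink(joinpath(bundle_dst, "bin", name), link)
    return link
end
```

With `depot` set to `~/.julia` (expanded), the resulting link lands in the same `bin` folder that app add uses, so a single PATH entry covers both install routes.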
Conclusion
In conclusion, I find that the Pkg app feature is very handy, but it suffers from loading time and TTFX delays. These issues can be (at least partly) solved by building a sysimage or a binary with juliac, but this requires rewriting methods, which is not always possible depending on dependencies.
It also breaks the simplicity of app add installation. In my opinion, these issues are showstoppers for using Julia as a CLI scripting tool.
I would be interested to hear:
- Are there better patterns to distribute fast Julia CLIs?
- Is there a more idiomatic way to handle printing in a juliac --trim context?
- Is there ongoing work to improve this workflow (e.g. automating sysimage builds for apps)?