Earlier this week I gave a presentation on Julia for data science and ML. One question came up to which I could only give a non-committal answer. Paraphrasing: once our notebooks work fine, how do we deploy at scale on various infrastructure solutions?
I did a bit of research and came up with:
Out of the box, without additional packages, Julia can run worker processes on a single machine or across several machines connected through SSH, using the Distributed standard library.
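To illustrate, here is a minimal sketch of that workflow with Distributed; the host names and worker counts are placeholders:

```julia
using Distributed

# Local workers on the current machine
addprocs(4)

# Remote workers over SSH: each tuple is (hostname, worker count);
# "node1" and "node2" are placeholder host names
addprocs([("node1", 2), ("node2", 2)]; exeflags="--project")

# Define the function on every worker, then map over the inputs in parallel
@everywhere work(i) = i^2
results = pmap(work, 1:100)
```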
There is a community focused on running Julia on parallel workloads (https://github.com/JuliaParallel). In particular, they oversee the MPI wrappers (MPI.jl) and ClusterManagers.jl, a package to deploy on many standard batch systems (e.g. Slurm). A rough example of the latter is shown below.
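Requesting workers from Slurm looks roughly like this sketch (assuming the ClusterManagers.jl API; the partition name and time limit are placeholders):

```julia
using Distributed, ClusterManagers

# Ask Slurm for 16 workers; extra keyword arguments are forwarded to srun
# ("compute" and the time limit are placeholder values)
addprocs(SlurmManager(16); partition="compute", t="00:30:00")

@everywhere println("hello from worker $(myid())")
```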
In other posts, many have pointed out that the initial compile time when spinning up a Docker image was an issue. Does PackageCompiler address that completely? What about a snapshotted VM image that is ready to go?
In my experience, with a good precompile_execution_file (not always easy to provide), the time needed to spawn a new Julia process and load all packages in the environment is reduced to roughly 1s at most. The first run of a function might still be a bit slower than usual if additional compilation is required, i.e. for methods that could not be captured in the system image. But that mostly eliminates the latency problem.
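For reference, a hedged sketch of how such a system image can be built with PackageCompiler.jl; the package names and file paths are placeholders:

```julia
using PackageCompiler

# Build a custom system image for the packages used by the service.
# "precompile.jl" should exercise the typical workload so the relevant
# method specializations get baked into the image.
create_sysimage(
    ["DataFrames", "CSV"];
    sysimage_path = "custom_sysimage.so",
    precompile_execution_file = "precompile.jl",
)
```

The resulting image is then loaded with `julia --sysimage custom_sysimage.so`, which can serve as the entry point of a Docker image.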
I guess everything depends on the use case, and especially the expected run time of the Julia process.
That might very well be a good idea, but I have never tried it…