Tool: sysimage_creator for IJulia users

Announcement

  • Hi, I’m happy to announce the release of sysimage_creator. It reduces the time needed to start a Jupyter kernel and to draw the first plot. Give it a try.
  • Below I will explain why I made this.

First motivation

  • Have you ever felt that running using Plots and drawing the first plot is slow?

    % julia
                   _
       _       _ _(_)_     |  Documentation: https://docs.julialang.org
      (_)     | (_) (_)    |
       _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
     _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
    |__/                   |
    
    julia> versioninfo()
    Julia Version 1.6.1
    Commit 6aaedecc44 (2021-04-23 05:59 UTC)
    Platform Info:
      OS: macOS (x86_64-apple-darwin18.7.0)
      CPU: Intel(R) Core(TM) i5-8210Y CPU @ 1.60GHz
      WORD_SIZE: 64
      LIBM: libopenlibm
      LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
    julia> @time using Plots
      4.117976 seconds (6.74 M allocations: 490.050 MiB, 4.78% gc time, 0.12% compilation time)
    
    julia> @time (p = plot(rand(5), rand(5)); display(p))
      9.885494 seconds (11.73 M allocations: 682.188 MiB, 2.96% gc time, 14.61% compilation time)
    
    julia>
    

In that case, you may want to read the following documentation provided by PackageCompiler.jl:

Here is what I did to reproduce the result described there. Note that it generates a sysimage named sys_plots.so.

% julia -e 'using Pkg; Pkg.add(["Plots", "PackageCompiler"])'
% echo "using Plots; p = plot(rand(5), rand(5)); display(p)" > precompile_plots.jl
% julia -e 'using PackageCompiler; create_sysimage(:Plots, sysimage_path="sys_plots.so", precompile_execution_file="precompile_plots.jl")'
% ls
sys_plots.so # it is generated by PackageCompiler
% julia -q --sysimage sys_plots.so
julia> @time using Plots
  0.000445 seconds (1.09 k allocations: 81.062 KiB, 648.57% compilation time)
julia> @time (p = plot(rand(5), rand(5)); display(p))
  0.593317 seconds (222.42 k allocations: 21.338 MiB, 4.90% compilation time)
  • Yes, it works well in the Julia REPL.
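
To double-check that a session really runs on the custom sysimage rather than the default one, something like the following sketch may help. It reads the image path from Base.JLOptions(), which is an internal (not officially documented) structure, and the helper names current_sysimage and using_sysimage are made up for illustration.

```julia
# Path of the system image the current Julia session was started with.
# Base.JLOptions() is internal API, so treat this as a convenience check only.
current_sysimage() = unsafe_string(Base.JLOptions().image_file)

# Hypothetical helper: true when the session runs on the image at `expected`.
using_sysimage(expected::AbstractString) =
    isfile(expected) && realpath(current_sysimage()) == realpath(expected)

println(current_sysimage())
println(using_sysimage("sys_plots.so"))
```

When Julia is started with --sysimage sys_plots.so from the directory containing the file, the second line should print true; in a default session it prints false.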

Utilize our sysimage

  • Can we utilize our sysimage sys_plots.so for running Julia on Jupyter Notebook/Lab?
  • The answer is yes. Before running Jupyter, install a Julia kernel with:
julia> using IJulia
julia> sysimage = joinpath(@__DIR__, "sys_plots.so")
julia> installkernel("Julia-sys-plots", "--project=@.", "--sysimage=$(sysimage)")
  • It will create a Julia kernel named Julia-sys-plots with the options --project=@. --sysimage=/path/to/sys_plots.so. See IJulia’s instructions to learn more.
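
Since a wrong --sysimage path is an easy mistake to make, it may be worth checking that the file exists before installing the kernel. Below is a minimal sketch; sysimage_kernel_flags is a hypothetical helper, not part of IJulia or sysimage_creator, and the commented usage assumes IJulia is installed and sys_plots.so is in the current directory.

```julia
# Hypothetical helper: build the flags for a sysimage-backed kernel and
# fail early if the sysimage file does not exist.
function sysimage_kernel_flags(sysimage::AbstractString; project::AbstractString="@.")
    isfile(sysimage) || error("sysimage not found: $(abspath(sysimage))")
    return ["--project=$(project)", "--sysimage=$(abspath(sysimage))"]
end

# Usage (assumes IJulia is installed and sys_plots.so exists):
# using IJulia
# installkernel("Julia-sys-plots", sysimage_kernel_flags("sys_plots.so")...)
```

Passing an absolute path keeps the kernel working no matter which directory Jupyter is launched from.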

  • OK, let’s run jupyter notebook and create a Julia notebook whose kernel is Julia-sys-plots 1.6.1 rather than the standard Julia 1.6.1. You’ll find that running using Plots; plot(rand(10), rand(10)) is very fast.

  • Satisfied? Hmm, have you thought … wait! Go on to the next section.

Second motivation

  • Have you ever felt that Jupyter (with the IJulia kernel) starts up slowly? Imagine restarting a Julia notebook via Kernel > Restart & Run All several times. It can be a little frustrating, because it takes several seconds for the first cell to update even if it contains only a very simple expression like 1+1.
  • Let’s say we have a Julia notebook named simple_math.ipynb that contains a single cell whose content is 1+1. In the Julia REPL, the following command takes about 17 seconds. (If the jupytext command is missing, install it via pip install jupytext or conda install jupytext -c conda-forge.)
julia> @time run(`jupytext --execute simple_math.ipynb`)
[jupytext] Reading simple_math.ipynb in format ipynb
[jupytext] Executing notebook with kernel julia-1.6
Starting kernel event loops.
[jupytext] Writing simple_math.ipynb (destination file replaced [use --update to preserve cell outputs and ids])
 17.560112 seconds (44 allocations: 1.875 KiB)
Process(`jupytext --execute simple_math.ipynb`, ProcessExited(0))

Note that if you switch the kernel of the notebook to Python, you’ll get:

julia> @time run(`jupytext --execute py_simple_math.ipynb`)
[jupytext] Reading py_simple_math.ipynb in format ipynb
[jupytext] Executing notebook with kernel python3
[jupytext] Writing py_simple_math.ipynb (destination file replaced [use --update to preserve cell outputs and ids])
  2.314238 seconds (44 allocations: 1.875 KiB)
Process(`jupytext --execute py_simple_math.ipynb`, ProcessExited(0))

Though it also has a little overhead, it is much faster than the Julia kernel.
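
The timings above come from single runs. For a slightly more stable comparison, one option is to repeat the command a few times and keep the fastest run, as in the sketch below; best_time is a hypothetical helper and taking the minimum is just one reasonable choice.

```julia
# Hypothetical helper: run `cmd` `n` times and return the fastest
# wall-clock time in seconds.
function best_time(cmd::Cmd; n::Integer=3)
    return minimum(@elapsed(run(cmd)) for _ in 1:n)
end

# Usage (assumes jupytext and the notebooks from this post are available):
# best_time(`jupytext --execute simple_math.ipynb`)
# best_time(`jupytext --execute py_simple_math.ipynb`)
```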

  • In short, we STILL have a lot of room to reduce latency. In the next section, I will provide a simple prescription for this issue.

Prescription

  • All right, let’s get started. First, install another Julia kernel to record/trace precompile statements:
installkernel("Julia-trace-nb", "--project=@.", "--trace-compile=traced_nb.jl")

Here we’ve used the --trace-compile flag to write “precompilation statements” to a file. See the instructions for PackageCompiler to learn more.

  • Second, create a notebook named nb.ipynb with the Julia-trace-nb 1.6.1 kernel and write whatever expressions you want in it, e.g. 1+1; using Plots; plot(rand(10), rand(10)) etc…

  • Third, run the command below:

% jupytext --execute nb.ipynb # will generate traced_nb.jl

Since we’ve used a Julia kernel with the option --trace-compile=traced_nb.jl, it will record precompilation statements in traced_nb.jl.
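
Before building a sysimage from the trace, it can be reassuring to peek at what was recorded. The sketch below counts the lines in the traced file that look like precompile(...) statements; precompile_statements is a hypothetical helper, and it assumes the usual one-statement-per-line output of --trace-compile.

```julia
# Hypothetical helper: collect the lines of a --trace-compile output file
# that look like precompile statements.
function precompile_statements(path::AbstractString)
    return [line for line in eachline(path) if startswith(strip(line), "precompile(")]
end

# Usage (assumes traced_nb.jl was produced by the step above):
# stmts = precompile_statements("traced_nb.jl")
# println(length(stmts), " statements recorded")
# println("mentions Plots: ", any(occursin("Plots", s) for s in stmts))
```

If the file is empty or contains nothing related to Plots, the notebook was probably not executed with the tracing kernel.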

  • What should we do next? Use create_sysimage, of course. Note that it has a keyword argument precompile_statements_file, which accepts the "traced_nb.jl" file we created above.
julia> using PackageCompiler
julia> create_sysimage(
  [:Plots], # you can also add StatsPlots etc...
  sysimage_path="sys_plots_nb.so", 
  precompile_statements_file="traced_nb.jl" # important
)
  • Finally, to test the sysimage sys_plots_nb.so, install a new kernel:
julia> using IJulia
julia> sysimage = joinpath(@__DIR__, "sys_plots_nb.so")
julia> installkernel("Julia-sys-plots-nb", "--project=@.", "--sysimage=$(sysimage)")

That’s it. Try creating and running a Jupyter notebook with Julia-sys-plots-nb.

You’ll find that @time run(`jupytext --execute simple_math-with-sys_plots_nb.ipynb`) tends to be faster than @time run(`jupytext --execute simple_math.ipynb`).

sysimage_creator

There are a lot of steps to set up manually… Do not worry! sysimage_creator will save you the time of building this environment. Give it a try!

git clone https://github.com/terasakisatoshi/sysimage_creator.git
cd sysimage_creator
make # create sysimage
make test # test out !!!

Lovely, thanks for sharing!

When I tried to --trace-compile an IJulia session (manually, not with jupytext) in my own setup earlier, the resulting sysimg could not be used in a kernel (the kernel died immediately). Did you encounter this issue too when developing this?

Another question, why do you choose include_transitive_dependencies = false in create_sysimage, is that necessary for it to work?

Hi @tfiers thank you for finding this article.

the resulting sysimg could not be used in a kernel (the kernel died immediately).

Did you set the --sysimage option appropriately?

See sysimage_creator/installkernel.jl at b68e4e608db734c3ceeb0b0f463116a879fad099 · terasakisatoshi/sysimage_creator · GitHub

For me, the following commands (described in the README) work.

$ cd /path/to/sysimage_creator
$ make

Another question, why do you choose include_transitive_dependencies = false in create_sysimage , is that necessary for it to work?

See the comment in this issue Lazy downloading artifacts with custom sysimage · Issue #639 · JuliaLang/PackageCompiler.jl · GitHub

It seems that MKL_jll is downloaded when the Jupyter kernel launches unless include_transitive_dependencies=false is set.


I see, thank you.

Yes, I successfully used an IJulia kernel with a sysimage before; the crashes only happened when I used a sysimage that had IJulia-traced precompile statements in it.

I will try your scripts next time I build a system image, and see if it works for me
(and if it does, narrow down what the difference is with what I did before).
