Slow Julia startup time after sysimage creation (and an unbelievable observation!)

Yes! You are right! Then it is much better/closer!

I this case the only thing remained, “let’s tell everyone, if you build it twice you get 4-6x faster startup!” :smiley:
Which is GAMECHANGER already! :smiley:

Well, for someone working on PackageCompiler, understanding why this happens would also be one thing remaining…

It would be useful to figure out if this effect is mostly due to a specific package (perhaps Zygote) and then if there is some specific thing that the package does that causes it.

1 Like

Just stumbled upon this interesting thread and decided to see if I can also see the speedup effect when using the double sysimg compile. And I do. Based on @Marcell_Havlik’s script I stripped away some of the larger packages like Zygote and Flux to only leave this (note that this also gets rid of the replacement of the default sysimg and the precompile execution script):

$ cat sysimage_awesome.jl
import Pkg;
installed_pkgs = Pkg.installed()
function add_pkg(name; kw...)
        !(haskey(installed_pkgs, name) && (!(:version in keys(kw)) || installed_pkgs[name] == kw[:version])) && Pkg.add(Pkg.PackageSpec(;name=name, kw...))
end 
add_pkg("Revise")
add_pkg("BenchmarkTools")
add_pkg("Distributions")
add_pkg("OhMyREPL")
add_pkg("EllipsisNotation")
add_pkg("LanguageServer")
add_pkg("Parameters")
add_pkg("JSON2")
add_pkg("JuliaFormatter")
add_pkg("LightGraphs")
add_pkg("GraphPlot")
add_pkg("Plots")
add_pkg("PackageCompiler")
# SymbolServer
using PackageCompiler
@show "Build starting..."

create_sysimage([:Revise, :JuliaFormatter, 
                :LanguageServer, 
                :PackageCompiler,
                :EllipsisNotation, :Parameters, :JSON2,
                :Distributions,
                :OhMyREPL,
                :Plots,
                ];
        replace_default=false,
        sysimage_path="mysysimage.so")

Then compiling the sysimg twice, the second time using the first sysimg:

$ julia sysimage_awesome.jl 
┌ Warning: Pkg.installed() is deprecated
└ @ Pkg /build/julia/src/julia-1.6.1/usr/share/julia/stdlib/v1.6/Pkg/src/Pkg.jl:566
"Build starting..." = "Build starting..."
[ Info: PackageCompiler: creating system image object file, this might take a while...

$ ls -l *.so
-rwxr-xr-x 1 melis users 315158440 Jun 23 11:39 mysysimage.so
$ mv mysysimage.so mysysimage.first.so

$ julia -J mysysimage.first.so sysimage_awesome.jl 
┌ Warning: Pkg.installed() is deprecated
└ @ Pkg /build/julia/src/julia-1.6.1/usr/share/julia/stdlib/v1.6/Pkg/src/Pkg.jl:566
"Build starting..." = "Build starting..."
[ Info: PackageCompiler: creating system image object file, this might take a while...
$ mv mysysimage.so mysysimage.second.so

$ ls -l *.so
-rwxr-xr-x 1 melis users 315158440 Jun 23 11:39 mysysimage.first.so
-rwxr-xr-x 1 melis users 316299536 Jun 23 11:44 mysysimage.second.so

Differences in runtime (for the second time running these commands):

$ time julia -J mysysimage.first.so -E 'print("Julia $(VERSION) with $(length(Base.loaded_modules)) module");'
Julia 1.6.1 with 137 modulenothing

real	0m0.691s
user	0m0.559s
sys	0m0.130s

$ time julia -J mysysimage.second.so -E 'print("Julia $(VERSION) with $(length(Base.loaded_modules)) module");'
Julia 1.6.1 with 137 modulenothing

real	0m0.306s
user	0m0.192s
sys	0m0.106s

Of course, this is a much smaller speedup than OP observes, but it’s still interesting to figure out why this occurs.

Differences in precompile statements:

$ julia -J mysysimage.first.so -E 'print("Julia $(VERSION) with $(length(Base.loaded_modules)) module");' --trace=stderr
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{URIs.RegexAndMatchData, 1}, URIs.RegexAndMatchData})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.MPFR.BigFloat, 1}, Base.MPFR.BigFloat})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.GMP.BigInt, 1}, Base.GMP.BigInt})
precompile(Tuple{ArrayInterface.var"#53#80"})
precompile(Tuple{ArrayInterface.var"#56#83"})
precompile(Tuple{ArrayInterface.var"#110#114"})
precompile(Tuple{ArrayInterface.var"#111#115"})
precompile(Tuple{ArrayInterface.var"#112#116"})
precompile(Tuple{typeof(Base.Libc.Libdl.dlopen), String, UInt32})
precompile(Tuple{typeof(Base.invokelatest), Any})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Array{Grisu.Bignums.Bignum, 1}, 1}, Array{Grisu.Bignums.Bignum, 1}})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, PlotThemes.PlotTheme}, PlotThemes.PlotTheme, Symbol})
precompile(Tuple{typeof(Base.merge), NamedTuple{(), Tuple{}}, Base.Dict{Symbol, Any}})
precompile(Tuple{typeof(Base.foreach), Function, Array{Base.Dict{Symbol, Any}, 1}, Array{Base.Dict{Symbol, Any}, 1}})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Nothing, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Tuple{Int64, Int64}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Base.Dict{Any, Any}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, ColorTypes.RGB{FixedPointNumbers.Normed{UInt8, 8}}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Float64, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Array{Any, 1}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Array{Float64, 1}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Measures.Length{:mm, Float64}, Symbol})
precompile(Tuple{typeof(Base.copyto!), Array{Symbol, 1}, Base.KeySet{Symbol, Base.Dict{Symbol, Any}}})
precompile(Tuple{typeof(Plots.treats_y_as_x), Symbol})
precompile(Tuple{typeof(Base.length), Base.Dict{Base.PkgId, Module}})
precompile(Tuple{typeof(Base.string), String, Base.VersionNumber, Vararg{Any, N} where N})
Julia 1.6.1 with 137 moduleprecompile(Tuple{typeof(Base.show), Nothing})
precompile(Tuple{typeof(Base.show), Base.TTY, Nothing})
nothing
precompile(Tuple{typeof(FileWatching.uvfinalize), FileWatching.FileMonitor})

$ julia -J mysysimage.second.so -E 'print("Julia $(VERSION) with $(length(Base.loaded_modules)) module");' --trace=stderr
precompile(Tuple{ArrayInterface.var"#110#114"})
precompile(Tuple{ArrayInterface.var"#111#115"})
precompile(Tuple{ArrayInterface.var"#112#116"})
precompile(Tuple{typeof(Base.length), Base.Dict{Base.PkgId, Module}})
precompile(Tuple{typeof(Base.string), String, Base.VersionNumber, Vararg{Any, N} where N})
Julia 1.6.1 with 137 moduleprecompile(Tuple{typeof(Base.show), Nothing})
precompile(Tuple{typeof(Base.show), Base.TTY, Nothing})
nothing
precompile(Tuple{typeof(FileWatching.uvfinalize), FileWatching.FileMonitor})
Julia Version 1.6.1
Commit 6aaedecc44* (2021-04-23 05:59 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.1.0 (ORCJIT, haswell)
1 Like

I can give you even extreme helps.
I don’t know when did I did this but I made some advancements and I copy a part of my scripts. I hope it works perfectly for you too!

import Pkg;
import Base

ENV["JULIA_DEBUG"] = "CUDA" # for better CUDA error messages.

Base.:*(a::Vector{String}, b::Vector{String}) = vcat(a, b) 
Base.:*(a::Vector{Pkg.Types.PackageSpec}, b::Vector{Pkg.Types.PackageSpec}) = vcat(a, b) 

packages = ["Revise"]
packages *= ["BenchmarkTools", "Zygote"]
packages *= ["Flux", "CUDA"]
packages *=["AMDGPU"]
packages *= ["Distributions", "EllipsisNotation", "Parameters", "FastClosures"]
packages *= ["JSON2", "JSON3","CSV", "JLD2", "WebSockets", "HTTP", "FileIO"]  # , "JLD", "Redis"
# packages *= ["GraphRecipes"]  
packages *= ["LightGraphs", "GraphPlot", "Colors", "Crayons", "NetworkLayout"]
packages *= ["DataStructures"]  # , "ORCA"
packages *= ["PlotlyBase", "Plots"]
packages *= ["KernelDensity", "GeometryTypes", "GeometryBasics", "DataFrames"]
packages *= ["Distributed", "ClusterManagers","Memoize", "Formatting"] # , "LoopVectorization"
packages *= ["AbstractPlotting"]
packages *= ["Makie", "GLMakie"]
packages *= ["PackageCompiler", "PyCall", "JuliaFormatter"] 
packages *= ["Glob", "LRUCache"]  # , "KernelDensity", "StatsPlots"
packages *= ["MethodAnalysis", "SnoopCompile", "SnoopCompileCore", "AbstractTrees", "CodeTracking"]  # , "ProfileView"
packages *= ["FiniteDifferences"]
packages *= ["DiffRules", "ChainRulesCore", "OrderedCollections", "JuliaInterpreter", "MacroTools"]
Pkg.add(packages)
println("Building packages...")
Pkg.build()

using PackageCompiler
precompileable_pkgs = 
[
  :Revise,  

  # :PrecompilePkg,
  
  :RelevanceStacktrace,

  :PackageCompiler,
  :Distributed,

  :PyCall,

  :Distributions,
  :Flux,
  :Zygote,

  # :CUDA,
  :Colors,
  :Crayons,

  :DataStructures,
  :JSON2,
  :JSON3,
  :JLD2,
  :FileIO,
  :CSV,
  :HTTP,
  :WebSockets,

  :EllipsisNotation,
  :Parameters,
  :Memoize,
  :Glob,

  :MethodAnalysis,
  :SnoopCompile,
  :BenchmarkTools,

  :Formatting,
  :JuliaFormatter,

  # :Plots,
  :GraphPlot, 
  :LightGraphs,
  :NetworkLayout,
  :PlotlyBase,
  :GeometryTypes,
]

syimage_name = "Julia17PreSysimageOpt"
syimage_name_1 = "$(syimage_name)7.so"
syimage_name_2 = "$(syimage_name)8.so" # The second image is double times faster...
println("Building 1. sysimage... ($syimage_name_1)")
create_sysimage(precompileable_pkgs; replace_default = false, sysimage_path="$(syimage_name_1)") # precompile_execution_file = "precompile.jl",
println("Build 1. Done! ($syimage_name_1)")

println("Build 2 started... ($syimage_name_2)")
cmd = """using PackageCompiler; create_sysimage($precompileable_pkgs; replace_default = false, sysimage_path="$(syimage_name_2)")"""
cmd = "'$cmd'"
run(`julia -J$(syimage_name_1) -e $(cmd[2:end-1])`)
println("Build 2 Done! ($syimage_name_2)")

This does compile the image with the image you compiled and it gets the speedup. :slight_smile:
Yeah it would be nice to figure out why do this speedup happen, but still then I enjoy the benefits. :smiley:

Then I just add the vscode config:

  "julia.additionalArgs": [
    "-J/home/computername/repo/url/Julia17PreSysimageOpt8.so"
  ],

in the end. (Btw I could do it automatically from the script.) Idk why I didn’t automate this yet… maybe for safety :smiley:

Some more small tests of the form:

$ cat genimg.jl
using PackageCompiler

create_sysimage([<packages>]; replace_default=false, sysimage_path="mysysimage.so")

$ julia genimg.jl
$ mv mysysimage.so mysysimage.first.so
$ julia -J mysysimage.first.so genimg.jl
$ mv mysysimage.so mysysimage.second.so
# Differences in runtime (first -> second, run more than once, take lowest)
$ time julia -J mysysimage.first.so -e ''
$ time julia -J mysysimage.second.so -e ''
# Differences in precompile statements 
$ julia -J mysysimage.first.so -e '' --trace=stderr
$ julia -J mysysimage.second.so -e '' --trace=stderr

Revise, JuliaFormatter, LanguageServer, PackageCompiler, OhMyREPL
0.248s → 0.174s

Missing with mysysimage.first.so:
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{URIs.RegexAndMatchData, 1}, URIs.RegexAndMatchData})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.MPFR.BigFloat, 1}, Base.MPFR.BigFloat})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.GMP.BigInt, 1}, Base.GMP.BigInt})
precompile(Tuple{typeof(Base.invokelatest), Any})

(some of those resize_nthreads! calls don’t really make sense, do they? E.g. BigFloat or RegexAndMatchData?)

Revise, PackageCompiler
0.217s → 0.122s

Missing with mysysimage.first.so:
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Base.PkgId, Revise.PkgData}, Revise.PkgData, Base.PkgId})

Revise
0.206s → 0.118s

Missing with mysysimage.first.so:
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Base.PkgId, Revise.PkgData}, Revise.PkgData, Base.PkgId})

Plots
0.512s → 0.156s

Missing with mysysimage.first.so:
precompile(Tuple{typeof(Base.Libc.Libdl.dlopen), String, UInt32})
precompile(Tuple{typeof(Base.invokelatest), Any})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.MPFR.BigFloat, 1}, Base.MPFR.BigFloat})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Base.GMP.BigInt, 1}, Base.GMP.BigInt})
precompile(Tuple{typeof(Base.Threads.resize_nthreads!), Array{Array{Grisu.Bignums.Bignum, 1}, 1}, Array{Grisu.Bignums.Bignum, 1}})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, PlotThemes.PlotTheme}, PlotThemes.PlotTheme, Symbol})
precompile(Tuple{typeof(Base.merge), NamedTuple{(), Tuple{}}, Base.Dict{Symbol, Any}})
precompile(Tuple{typeof(Base.foreach), Function, Array{Base.Dict{Symbol, Any}, 1}, Array{Base.Dict{Symbol, Any}, 1}})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Nothing, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Tuple{Int64, Int64}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Base.Dict{Any, Any}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, ColorTypes.RGB{FixedPointNumbers.Normed{UInt8, 8}}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Float64, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Array{Any, 1}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Array{Float64, 1}, Symbol})
precompile(Tuple{typeof(Base.setindex!), Base.Dict{Symbol, Any}, Measures.Length{:mm, Float64}, Symbol})
precompile(Tuple{typeof(Base.copyto!), Array{Symbol, 1}, Base.KeySet{Symbol, Base.Dict{Symbol, Any}}})
precompile(Tuple{typeof(Plots.treats_y_as_x), Symbol})

PackageCompiler
Missing with mysysimage.first.so:
none!

1 Like

So do I understand well, the second time there are less precompiled directives that goes to stderr?

Yes, in all cases there are no precompiled directives printed when using the second sysimg