Timing for the official 1.9 master (not my unofficial build) shows startup roughly 7 to 8% faster than Julia 1.7.0, although the run-to-run spread is of similar size:
$ hyperfine '~/Downloads/julia-a60c76ea57/bin/julia -e ""'
Benchmark 1: ~/Downloads/julia-a60c76ea57/bin/julia -e ""
Time (mean ± σ): 191.2 ms ± 14.4 ms [User: 179.9 ms, System: 303.4 ms]
Range (min … max): 168.6 ms … 213.0 ms 14 runs
$ hyperfine 'julia -e ""'
Benchmark 1: julia -e ""
Time (mean ± σ): 206.0 ms ± 29.6 ms [User: 165.5 ms, System: 122.3 ms]
Range (min … max): 182.4 ms … 278.2 ms 10 runs
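For anyone who wants to redo the arithmetic, here is a small sketch of how I would turn two hyperfine means into a percentage. The helper name `speedup_pct` is made up for this post, and it measures the reduction relative to the older build; dividing by the newer mean instead gives a slightly larger number, so treat the result as approximate.

```shell
# Percentage reduction in mean wall time relative to the baseline.
# $1 = baseline mean in ms, $2 = new mean in ms.
# LC_ALL=C keeps awk on "." as the decimal separator regardless of locale.
speedup_pct() {
    LC_ALL=C awk -v base="$1" -v cur="$2" \
        'BEGIN { printf "%.1f\n", (base - cur) / base * 100 }'
}

speedup_pct 206.0 191.2   # 1.7.0 vs. 1.9 master, julia -e ""
speedup_pct 213.0 190.1   # 1.7.0 vs. 1.9 master, "Hello world" run
```

With standard deviations of 14 to 30 ms on roughly 200 ms, a difference of this size is not far outside the noise.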
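For the 1.9 build, the “System” time (303.4 ms) is larger than the total wall time (191.2 ms), which implies more than one thread (2?) is in use. Note that with the timings further below, taken with time, I get very different numbers for “sys”, so I am not sure how to interpret hyperfine's figure; my guess is that it is CPU time accumulated over the threads, here presumably 2.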
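A rough way to check the thread reasoning is to divide the CPU time (user plus system) by the wall time; a ratio above 1 means that, on average, more than one core was busy. This is only a sketch under the assumption that hyperfine's User and System figures are CPU time summed over all threads of the process; `cpu_per_wall` is a name I made up.

```shell
# Average number of busy cores: (user + sys) / wall.
# $1 = user ms, $2 = system ms, $3 = mean wall time ms.
# Assumes user/sys are CPU times summed over all threads.
cpu_per_wall() {
    LC_ALL=C awk -v user="$1" -v sys="$2" -v wall="$3" \
        'BEGIN { printf "%.2f\n", (user + sys) / wall }'
}

cpu_per_wall 179.9 303.4 191.2   # 1.9 master, julia -e ""
cpu_per_wall 165.5 122.3 206.0   # 1.7.0,      julia -e ""
```

Both builds come out above 1, with 1.9 master clearly higher, so the newer build seems to do more work in parallel during startup rather than less work overall.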
How could the startup time be reduced further? I was thinking of compiling my own Julia (already done, not shown here) and throwing out as much as possible: anything not needed, such as LinearAlgebra, maybe Threads, and basically everything in Base that Julia itself does not use.
Note that some stdlibs, e.g. Statistics and DelimitedFiles, have already been dropped from the sysimage in Julia 1.9, or so I thought. Whatever the reason(s) for the speedup, whether that or something else, the sysimage is actually larger:
232731608 jún 25 17:00 sys.so
32830120 jún 25 16:48 libopenblas64_.0.3.20.so
vs in 1.7.0:
199483960 nóv 30 2021 sys.so
31736520 nóv 30 2021 libopenblas64_.0.3.13.so
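To put the two listings side by side, this sketch computes the growth of each file from the byte counts above. `size_growth` is a hypothetical helper, not an existing tool.

```shell
# Growth of a file between two builds, in MiB and in percent.
# $1 = old size in bytes, $2 = new size in bytes.
size_growth() {
    LC_ALL=C awk -v old="$1" -v cur="$2" \
        'BEGIN { d = cur - old; printf "%.1f MiB (%.1f%%)\n", d / 1048576, d / old * 100 }'
}

size_growth 199483960 232731608   # sys.so, 1.7.0 -> 1.9 master
size_growth 31736520 32830120     # libopenblas64_, 0.3.13 -> 0.3.20
```

Almost all of the growth is in sys.so; libopenblas64_ has barely changed in size.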
Those are the big-ticket items: shrink sys.so, or eliminate e.g. LinearAlgebra and with it libopenblas64. I have yet to profile anything (I recall from a JuliaLang issue that it has been done). Can anyone tell me where to look in the code to remove e.g. that .so, which tools are best for profiling, or point me to that forgotten issue?
A. I’m thinking of doing this as an unofficial (breaking) Julia 2.0, not as a hostile takeover, but to explore how much can and should be taken out while still being useful for scripts and benchmarks such as the Benchmarks Game hosted by Debian (some programs there require threads, and at least one needs GMP/BigInt, but none need LinearAlgebra).
B. I’m also considering implementing some of the changes from the 2.0 milestone (any ideas?), those that seem sensible, at least if they make things faster, and also removing Dict from Base, i.e. replacing it with an unexported version better suited to Julia's internal use. I suspect Julia itself only needs small Dicts, not a scalable Dict implementation.
$ time julia --startup-file=no -O0 -e "println(\"Hello world\")"
Hello world
real 0m0,216s
user 0m0,194s
sys 0m0,069s
$ time ~/Downloads/julia-a60c76ea57/bin/julia --startup-file=no -O0 -e "println(\"Hello world\")"
Hello world
real 0m0,192s
user 0m0,160s
sys 0m0,156s
$ hyperfine '~/Downloads/julia-a60c76ea57/bin/julia --startup-file=no -O0 -e "println(\"Hello world\")"'
Benchmark 1: ~/Downloads/julia-a60c76ea57/bin/julia --startup-file=no -O0 -e "println(\"Hello world\")"
Time (mean ± σ): 190.1 ms ± 14.6 ms [User: 177.9 ms, System: 291.8 ms]
Range (min … max): 170.7 ms … 208.6 ms 14 runs
$ hyperfine 'julia --startup-file=no -O0 -e "println(\"Hello world\")"'
Benchmark 1: julia --startup-file=no -O0 -e "println(\"Hello world\")"
Time (mean ± σ): 213.0 ms ± 17.8 ms [User: 183.2 ms, System: 140.9 ms]
Range (min … max): 184.9 ms … 236.4 ms 12 runs