Shouldn't 1.8.0 be faster than Julia 1.7?

However, it’s still symptomatic that 1.8 consistently takes longer:

  | | |_| | | | (_| |  |  Version 1.8.0-rc3 (2022-07-13)
 _/ |\__'_|_|_|\__'_|  |  Official release
|__/                   |

julia> @time (1:10^9).^2;
  3.972610 seconds (2 allocations: 7.451 GiB, 0.11% gc time)


λ C:\programs\julia-1.7\bin\julia
   _       _ _(_)_     |  Documentation:
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official release
|__/                   |

julia> @time (1:10^9).^2;
  1.903721 seconds (2 allocations: 7.451 GiB, 0.62% gc time)

Not for me. On the benchmark from post #20 by tim.holy in this thread, it’s


julia> @bprofile f()
BenchmarkTools.Trial: 8658 samples with 1 evaluation.
 Range (min … max):  461.459 μs …  1.201 ms  ┊ GC (min … max): 0.00% … 25.41%
 Time  (median):     560.666 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   574.060 μs ± 89.722 μs  ┊ GC (mean ± σ):  4.29% ±  8.08%

    ▁██▄▃      ▆▆▃▃                                             
  ▁▂█████▇▃▂▂▃██████▄▃▄▄▄▄▄▃▃▂▂▂▂▃▄▄▄▃▃▃▂▂▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  461 μs          Histogram: frequency by time          886 μs <

 Memory estimate: 7.63 MiB, allocs estimate: 2.


julia> @bprofile f()
BenchmarkTools.Trial: 7688 samples with 1 evaluation.
 Range (min … max):  452.503 μs …   1.562 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     673.130 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   645.455 μs ± 143.323 μs  ┊ GC (mean ± σ):  4.75% ± 9.26%

     ▅█▄                  ▁▂▂                                    
  ▁▂▆███▆▃▂▂▂▂▂▁▁▁▁▂▂▃▃▃▄▆████▄▃▂▂▂▂▂▂▂▂▂▃▃▃▃▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁ ▂
  453 μs           Histogram: frequency by time         1.04 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

It does, however, suggest the GC sweeps are less frequent but take longer. That’s a reasonable tradeoff.
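
For anyone who wants to check the GC observation on their own machine, here is a minimal Base-only sketch. It assumes `f() = (1:10^6).^2`, which is only a guess based on the 7.63 MiB memory estimate above (the real definition is in the linked post), and `sample_gc` is a made-up helper name.

```julia
# ASSUMPTION: f is not defined in this post; this guess is consistent with
# the "Memory estimate: 7.63 MiB" reported above.
f() = (1:10^6).^2

# Hypothetical helper: run f() n times, recording wall time and GC time per run.
function sample_gc(n)
    times = Float64[]
    gctimes = Float64[]
    for _ in 1:n
        s = @timed f()           # NamedTuple; .time and .gctime are in seconds
        push!(times, s.time)
        push!(gctimes, s.gctime)
    end
    return times, gctimes
end

f()                               # warm up so compilation is not measured
times, gctimes = sample_gc(200)

gc_runs = count(>(0), gctimes)    # number of runs that triggered a collection
gc_share = sum(gctimes) / sum(times)
println("runs with GC: ", gc_runs, " of ", length(times))
println("GC share of total time: ", round(100 * gc_share; digits = 2), "%")
if gc_runs > 0
    println("mean GC pause: ", round(1e3 * sum(gctimes) / gc_runs; digits = 3), " ms")
end
```

Fewer runs with GC but a longer mean pause on 1.8 than on 1.7 would match the "less frequent but take longer" reading. The numbers will vary from machine to machine.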

On the “benchmark” with 10^9 samples, it depends on how many browser tabs you have open, whether you switched windows between profiling Julia and checking your Slack, whether you remembered to close the 1.7 window when you ran 1.8, etc. Unless you really know what you’re doing, and your intent is to profile, e.g., your OS swap behavior, the 10^9 case is 100%, entirely, completely, utterly useless. Everyone should please stop worrying about it.
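
To put numbers on why the 10^9 case is so sensitive to what else is running, here is the back-of-the-envelope arithmetic as a small sketch. It assumes 64-bit integers (the default on 64-bit Julia); `result_bytes`, `gib` and `mib` are helper names made up for this illustration.

```julia
# Size in bytes of the Vector{Int64} that (1:n).^2 materializes.
result_bytes(n) = n * sizeof(Int64)

gib(bytes) = bytes / 2^30
mib(bytes) = bytes / 2^20

println(round(gib(result_bytes(10^9)); digits = 3), " GiB")  # matches the 7.451 GiB reported by @time
println(round(mib(result_bytes(10^6)); digits = 2), " MiB")  # matches the 7.63 MiB memory estimate
```

A single call at 10^9 needs about 7.45 GiB of free RAM, so on a typical desktop it competes with everything else that is open. A result of about 7.63 MiB does not have that problem.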


For some less synthetic benchmarks, I have been tracking the benchmark results of some of my packages across Julia versions at QuantumClifford Benchmarks.

For that package 1.9-nightly is better than 1.8-beta which is better than 1.6.0 and 1.7.0. Thus, at least one person is really happy with the improvements in the Julia runtime and compiler.

There have also been nice improvements in TTFX, but they have been negated by some packages moving out of Base and having to be loaded as well.


I would if this were the only case I knew of (I understand your point), but it isn’t:

  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official release
|__/                   |

julia> @time using GMT
[ Info: Precompiling GMT [5752ebe1-31b9-557e-87aa-f909b540aa54]
 24.749264 seconds (4.43 M allocations: 267.346 MiB, 0.42% gc time, 4.02% compilation time)

julia> tic(); plot(rand(5,2)); toc()
elapsed time: 7.3018712 seconds

julia> @time using GMT
[ Info: Precompiling GMT [5752ebe1-31b9-557e-87aa-f909b540aa54]
 26.793838 seconds (3.79 M allocations: 235.745 MiB, 0.36% gc time, 4.47% compilation time: 49% of which was recompilation)

julia> tic(); plot(rand(5,2)); toc()
elapsed time: 10.7382274 seconds

and these are actually 1.8 friendly. The precompile time is normally around 32 s, but today it decided to be a bit faster.

I know of your work in

and I’m eager to see if it fixes this time regression :-)

I’ve been working on that case for 2 straight weeks :-). […] almost fixes it. There’s still a 1 s regression (EDIT: you can shave this to 0.5 s if you use […]), but it’s no longer due to any of the precompilation improvements, and in fact they make your life better. But lots of things change from version to version; LLVM has changed its performance profile too, and you’re one of the unlucky people for whom the net effect is worse. Sorry.


But if you run this many times, in different states of your PC, do you still see consistent behaviour?

Oh, yes. I do that dozens of times per day (especially the precompile).

And we are all very much indebted to your work on this side of Julia.


This is a bit of a tangent, but WOW, this tool looks amazing: SnoopPrecompile (in the SnoopCompile docs). Your message here is the first time I’ve heard about it.

That’s because it only got submitted as a new package this morning, and it will be Monday before it’s actually released.

The package itself is really simple, but it’s designed to make precompilation less finicky. It only does 3 things: (1) run the block only when precompiling, (2) disable the interpreter when running the block (to ensure everything gets compiled), and (3) intercept runtime dispatch to force precompilation of calls to methods defined in other packages. Each of these is in itself just a few lines, and all the supporting infrastructure for this has existed for a while now (although I guess that depends on perspective: most of it’s not in 1.7, but all of it will be in 1.8). The package just combines them in an easy-to-use wrapper that hopefully means people will be able to get high-quality precompilation without being an expert.
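
To make point (1) concrete, here is a rough Base-only sketch of "run the block only when precompiling". This is not SnoopPrecompile's code or API: `is_precompiling` and `workload` are names invented for this illustration, and the `jl_generating_output` check is the usual idiom for detecting that Julia is currently writing a cache file, which I am assuming is close to what the package does internally. Points (2) and (3) are not shown.

```julia
# Hypothetical helper (not from SnoopPrecompile): true only while Julia is
# generating output, e.g. while a package's *.ji cache file is being written.
is_precompiling() = ccall(:jl_generating_output, Cint, ()) == 1

# Stand-in for the calls a package would want compiled ahead of time.
workload() = sum(abs2, 1:10)

# At the top level of a package module, this runs the workload during
# precompilation and skips it in an ordinary session.
if is_precompiling()
    workload()
end

println("precompiling right now? ", is_precompiling())
```

In a normal REPL or script the guard is false, so the workload costs nothing at load time. That is the whole point of step (1).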

There’s one exception: invalidation will remain a threat for the foreseeable future, and that still takes some expertise to diagnose and fix. I don’t yet have any great ideas about making this easier, but maybe someone else will come up with something.


julia> @bprofile f()
BenchmarkTools.Trial: 5650 samples with 1 evaluation.
 Range (min … max):  341.958 μs …   7.635 ms  ┊ GC (min … max):  0.00% … 88.89%
 Time  (median):     764.125 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   882.770 μs ± 373.281 μs  ┊ GC (mean ± σ):  14.80% ± 18.36%

  ▄▂▂▁▂▁▁▅███████▇▆▅▄▃▃▂▃▃▃▃▃▃▃▃▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  342 μs           Histogram: frequency by time         2.39 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.