What next after SnoopCompile?

I’m working on reducing TTFX for my project using SnoopCompile.jl.

  • I have eliminated all relevant invalidations according to precompile_blockers
  • @snoopi_deep reports the following for my workload:
julia> tinf
InferenceTimingNode: 0.899820/0.965381 on Core.Compiler.Timings.ROOT() with 22 direct children

The resulting inference flamegraph is almost entirely blank space, which I understand is attributable to code generation.

Despite this progress, there is still an unexplained 10-second pause in my workload.

I’m doing something like the following:

# Redefine a very small module which imports my package
@time include("small_module.jl")
@info "done with module loading"
@time SmallModule.run_workload()

This results in the following output:

  4.013175 seconds (21.86 M allocations: 1.273 GiB, 5.50% gc time, 7.90% compilation time: 3% of which was recompilation)
[ Info: done with module loading
<<<<< Ten second pause >>>>>
  0.919104 seconds (359.83 k allocations: 254.819 MiB, 6.49% gc time, 23.76% compilation time)

The four seconds are due to using MyPackage inside SmallModule.
The 0.9 seconds roughly tracks my estimate of how long the workload really takes, based on logging.

Questions

  • Is the 10-second pause already accounted for in the blank space in @snoopi_deep’s flamegraph, but excluded from @time?
  • If not, could it still be attributable to codegen?
  • How can I go about debugging what Julia is doing during this ten-second pause? (One sanity check I can think of is sketched below.)
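For what it’s worth, that sanity check would be to bracket the script with wall-clock timestamps (plain Base time() calls), to confirm that the pause really falls between the two @time windows rather than inside them:

t0 = time()
@time include("small_module.jl")
@info "done with module loading" elapsed = time() - t0
@time SmallModule.run_workload()
@info "workload finished" elapsed = time() - t0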

SnoopCompile script for reference

using SnoopCompileCore

@time invalidations = @snoopr begin 
    include("small_module.jl")
end

using SnoopCompile

@time tinf = @snoopi_deep begin
    SmallModule.run_workload()
end

trees = invalidation_trees(invalidations)
blockers = precompile_blockers(trees, tinf)

I have not kept up in this area. Where did you learn that you have to do that? I want to learn how to do these things and prepare my package.

It’s a bit hard to guess exactly what’s going on here since most of the code is missing. Here are some relevant questions:

  • Are you using Julia 1.9? Precompilation is “fixed” on 1.9 but incomplete on previous Julia releases.
  • Is nearly all your “real” code in one or more packages that you load with using MyPackage? (I think you said as much, but worth being sure.)
  • Do all your packages have a precompilation workload? For most packages you want to use SnoopPrecompile, but if your code is really well-inferred then it’s not necessary.
  • In your script, what happens if you run the @snoopi_deep step before loading SnoopCompile? SnoopCompile can (and does) invalidate code and force recompilation; that’s why SnoopCompileCore exists: it lets you collect the data before you load the “dangerous” code that performs the analysis. You should run both @snoopr and @snoopi_deep before loading SnoopCompile (see the sketch just below). Just be sure to suppress printing of tinf, since its printed form can be very long before SnoopCompile is loaded. (SnoopCompile defines the relevant show methods; they are not in SnoopCompileCore because defining new methods for Base functions is the best way to invalidate poorly-inferred code.)
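Concretely, reordering your script so that both datasets are collected before SnoopCompile is loaded would look something like this:

using SnoopCompileCore

invalidations = @snoopr include("small_module.jl")
tinf = @snoopi_deep SmallModule.run_workload();   # trailing ; suppresses printing of tinf for now

using SnoopCompile   # only load the analysis code after the data is collected

trees = invalidation_trees(invalidations)
blockers = precompile_blockers(trees, tinf)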

Here is a great one, explained thoroughly.


Thanks for the help, Tim.

  • Yes, using Julia 1.9-rc1
  • Yes, all my “real” code is in a package that is loaded via using MyPackage in the SmallModule.
  • I have added precompilation workloads where @snoopi_deep indicated they would be useful (roughly of the shape sketched below).
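For reference, the workloads are roughly of this shape (details elided; the setup data here is illustrative):

module MyPackage

using SnoopPrecompile

# ... package code ...

@precompile_setup begin
    # setup objects created here are not themselves cached
    d = Dict{String, Any}("key" => 1)
    @precompile_all_calls begin
        # exercise the call paths that @snoopi_deep flagged
        merge!(Dict{String, Any}(), d)
    end
end

end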

What makes it unclear how to proceed now is that the output of @snoopi_deep looks really “good”:

The root InferenceTimingNode prints as follows:

julia> tinf
InferenceTimingNode: 0.965624/1.002108 on Core.Compiler.Timings.ROOT() with 21 direct children

All of those inference roots are for “setup” code, stuff like Base.merge!(Dict{String, Any}, Dict{String, Any}), not the main part of my workload. On top of this, the outputs of staleinstances and precompile_blockers are both empty.

Looking at the invalidation trees again, there are some things that I should probably be concerned about. However, I can’t see how these show up in the inference flamegraph. My mental model for how invalidations affect compilation time is:

Invalidation -> Inference -> Code generation

As I understand it, any invalidation which results in significant compilation time should show up in the @snoopi_deep flamegraph. Is that mistaken?

In the meantime I will keep plugging away on those invalidations and see if they fix the problem, but I would love to understand this better.

Update: I removed a dependency on Polynomials.jl, some of whose methods were getting invalidated. This seems to have fixed the situation where the bulk of my workload’s compilation was not being measured by @snoopi_deep; the inference flamegraph now looks much worse, but the real time spent has decreased slightly. Now there are some meaningful inference roots to continue tackling.
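By “tackling the inference roots” I mean the usual SnoopCompile follow-up, along these lines:

itrigs = inference_triggers(tinf)              # what forced fresh inference during the workload
mtrigs = accumulate_by_source(Method, itrigs)  # group the triggers by the method that caused them
fg = flamegraph(tinf)                          # visualize, e.g. with ProfileView.view(fg)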

As a data point, I was using Polynomials inside the “compile-time” part of some @generated functions.

I think your understanding is correct. If all of your remaining compilation is just due to setup code, then you can create one or more personal StartupX.jl packages that also precompile the setup code.
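A minimal sketch of such a startup package (the package name and workload are hypothetical; adapt them to your setup code):

module StartupMyProject

using MyPackage
using SnoopPrecompile

@precompile_all_calls begin
    # run the "setup" calls that showed up as inference roots,
    # so their compiled code is cached with this package
    merge!(Dict{String, Any}(), Dict{String, Any}("key" => 1))
end

end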

Ah, interesting. Can you paste that portion of the code here? Indeed it’s possible that for @generated functions we have an incomplete story (I’m unsure). We don’t track backedges for the generator, for example.
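In other words, a pattern like the following (a hypothetical sketch, not your actual code) could end up stale without anything forcing the generator to rerun:

using Polynomials

@generated function horner_fixed(x, ::Val{C}) where {C}
    # The generator body runs at compile time; this call into Polynomials
    # has no backedge, so invalidating the methods it uses does not
    # mark this generated function for recompilation.
    p = Polynomial(collect(C))
    cs = Polynomials.coeffs(p)
    # Build an unrolled Horner evaluation of the fixed coefficients.
    ex = :(zero(x))
    for c in reverse(cs)
        ex = :(muladd(x, $ex, $c))
    end
    return ex
end

# horner_fixed(2.0, Val((1.0, 2.0, 3.0)))   # 1 + 2x + 3x^2 at x = 2 -> 17.0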