Taking TTFX seriously: Can we make common packages faster to load and use

How does one check the effect of precompilation on TTFX? Just time using with and without?

9 posts were split to a new topic: Why isn’t size always inferred to be an Integer?

@PetrKryslUCSD I'm using TTFX by analogy with time-to-first-plot: the time to load a package plus the time to do the main thing X that it does, measured with @time thefunction(x).
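For a repeatable number, one option is to script this so that every measurement runs in a fresh process. A minimal sketch, assuming the stdlib package Dates and the call Dates.now() as stand-ins for the real package and its main function; ttfx_seconds is a hypothetical helper, not part of any package:

```julia
# Hypothetical helper: measure load time and first-call time in a fresh process,
# so that nothing is already compiled from an earlier run in this session.
function ttfx_seconds(setup::AbstractString, call::AbstractString; flags = String[])
    code = """
    t0 = time_ns()
    $setup
    t1 = time_ns()
    $call
    t2 = time_ns()
    println((t1 - t0) / 1e9, " ", (t2 - t1) / 1e9)
    """
    cmd = `$(Base.julia_cmd()) --startup-file=no $flags -e $code`
    # Only the last line is parsed, in case the call itself prints something.
    lastline = last(split(strip(read(cmd, String)), '\n'))
    load_s, call_s = parse.(Float64, split(lastline))
    return (load = load_s, first_call = call_s)
end

# Stand-ins: replace with `using YourPackage` and its main call.
r = ttfx_seconds("using Dates", "Dates.now()")
println("load: ", r.load, " s, first call: ", r.first_call, " s")
```

Passing flags = ["--compiled-modules=no"] repeats the measurement with the precompile cache disabled, which gives a rough with/without comparison. For a package with many dependencies that run can be very slow, since everything is then loaded from source.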

My PR above made Blink load in 1 s, but Window() still takes 15 s, so it's not a huge gain yet.


Yeah SnoopCompile is more complicated. I mostly just use ProfileView.jl and Cthulhu.jl and give up after that. Anything really bad is pretty obvious in ProfileView. But I would also like to know how to use SnoopCompile better.


So, inspired by this thread I decided to ProfileView the TTFX for my package, DFTK, which currently stands at a ridiculous 90 s. About 1/5 of that time is in inference, without reference to package code. Some places I can recognize are functions that have nested functions, but that is a pattern I use a lot and I'm not sure how to do without (and not sure if that's even the problem). A lot of the advice I see for reducing TTFP is about method redefinitions, but I don't think I redefine many Base functions. I'd appreciate any advice.
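One known way nested functions can defeat inference is when a captured variable is reassigned inside the closure, which forces it into a box. Whether that applies to DFTK is purely an assumption; the sketch below, with made-up functions sum_boxed and sum_ref, only shows how one might check for it with Base.return_types:

```julia
# `s` is reassigned inside the nested function, so it is stored in a Core.Box
# and inference loses track of its type.
function sum_boxed(n)
    s = 0
    add!(x) = (s += x)
    foreach(add!, 1:n)
    return s
end

# Same logic, but the closure captures a Ref that is never rebound,
# so the element type stays known to inference.
function sum_ref(n)
    s = Ref(0)
    add!(x) = (s[] += x)
    foreach(add!, 1:n)
    return s[]
end

boxed_type = only(Base.return_types(sum_boxed, (Int,)))
ref_type = only(Base.return_types(sum_ref, (Int,)))
println("boxed: ", boxed_type, ", ref: ", ref_type)
```

Both functions compute the same value; only the inferred return type differs. If the nested functions in a package never reassign what they capture, this particular pattern is probably not the culprit.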


I wonder if the compiler could read assertions, though, and use that information.

A question more on the subject: can someone provide a clear step-by-step of what they do to produce these flamegraphs and benchmark TTFX? It is not clear to me how to do that, given that the first execution of the profiler has to be discarded, and the second run is, well, no longer the first run (at least those are the instructions for the profiler in VSCode).

So let it also solve this issue :smiling_imp:

A post was merged into an existing topic: Why isn’t size always inferred to be an Integer?

You actually want to time and/or profile the first run. So start a fresh session, and:

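using Profile, ProfileView          # @profile lives in the Profile stdlib
@profile @time using SomePackage    # profile and time the package load
ProfileView.view()                  # flamegraph of the load
@profview @time themainfunction(x)  # profile and time the first call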

Profiling slows things down a bit, though. I also found that ProfileView struggles with really huge load times, and plain @profile can work better. From that you can spot obvious problems, what code and packages actually take time to load, etc.
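When the graphical view is too heavy, the Profile standard library can print a text report instead. A small sketch, with a made-up function work standing in for the real first call:

```julia
using Profile

# Made-up workload; in practice this would be the package load or its main call.
work(n) = sum(sqrt(abs(sin(i))) for i in 1:n)

Profile.clear()                       # drop samples from earlier runs
result = @profile work(5_000_000)     # first call, so compilation is sampled too

# Flat listing sorted by sample count, hiding lines with fewer than 5 samples.
Profile.print(format = :flat, sortedby = :count, mincount = 5)
```

The flat format is easier to skim for very long runs; leaving out the format keyword gives the default tree view.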

Edit:

And also to look at the functions taking inference time:

using SnoopCompile
tinf = @snoopi_deep themainfunction(x)
fg = flamegraph(tinf)
ProfileView.view(fg)

Then:

using Cthulhu
@descend themainfunction(x)

And look around to find the unstable (red) bits that looked slow to load in ProfileView, and anything else that seemed like a problem. I'm sure there are more sophisticated ways to do this, but it gets you a long way.
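For a quick pass/fail check on a single call, without opening Cthulhu, the Test standard library's @inferred macro can also flag an unstable return type. A toy sketch, with hypothetical functions clamp_unstable and clamp_stable:

```julia
using Test

# Returns an Int for positive Int input and a Float64 otherwise: type-unstable.
clamp_unstable(x) = x > 0 ? x : 0.0

# Always returns a Float64 for Int input: type-stable.
clamp_stable(x) = x > 0 ? float(x) : 0.0

# @inferred returns the value when the inferred type matches, and throws otherwise.
stable_value = @inferred clamp_stable(1)

unstable_detected = try
    @inferred clamp_unstable(1)
    false
catch
    true
end

println("stable value: ", stable_value, ", instability detected: ", unstable_detected)
```

This only checks the outermost return type, so it complements rather than replaces descending through the call tree.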


I had understood that this can profile the compilation of the profiler as well. Doesn’t it?

I assumed @profile was compiled in the base Julia image.


As a developer not very familiar with the subject of "what should I write additionally to make my packages faster to use", it would be nice to have a simple build flag that says "just precompile this package from all of its own tests, and don't worry about the excess code".


Don’t know how realistic this is, but definitely a good idea.

This is the prof view for FinEtools.


So I think this is telling me that either I get rid of abstract types, or live with the using time of 15 seconds. The big block in the middle is all type inference…

Did you try to force precompilation of that piece?

If you are asking whether I have __precompile__(true) in the top module, the answer is yes; if you mean whether I have targeted instructions in the code, the answer is no.

Try calling that code at using time and see if it can precompile.

Sorry: which code?

The function the profile is pointing to.
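As a sketch of what "calling that code at using time" can look like, assuming a hypothetical module DemoPkg with a function area: the module either asks for a specific method to be compiled with precompile, or runs a small workload while the package is being precompiled.

```julia
module DemoPkg   # hypothetical package module

area(r) = pi * r^2

# Explicit directive: compile `area` for Float64 arguments.
# `precompile` returns true when compilation succeeded.
const AREA_PRECOMPILED = precompile(area, (Float64,))

# Alternative: run a small workload, but only while Julia is generating
# the precompile cache, so normal loading does not pay for it.
if ccall(:jl_generating_output, Cint, ()) == 1
    area(1.0)
end

end # module

println(DemoPkg.AREA_PRECOMPILED, " ", DemoPkg.area(2.0))
```

How much of the compiled code actually ends up in the cache depends on the Julia version and on which package owns the methods involved, so it is worth re-timing in a fresh session afterwards.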

Sorry about the confusion: I did

using ProfileView
@profview @time using FinEtools

There is no “main” function per se.