Best practices for profiling compilation

bkamins · October 12, 2020, 6:21pm

In DataFrames.jl we see a strong trend: people want more and more powerful functions for transforming data frames (to match dplyr, data.table, pandas, …).

This leads to a situation where the code base becomes really complex, which means long compilation times. Additionally, transformation functions most often take anonymous functions as arguments, so this is very often a “run once” case: most of the time the user will not call the transformation twice with the same argument types, because every anonymous function has its own new type.
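To make the “run once” point concrete, here is a minimal sketch using only Base Julia. It is not taken from DataFrames.jl; the names `f`, `g`, and `data` are made up for illustration. It shows that two anonymous functions with identical bodies still have different types, so a higher-order function such as `map` is compiled separately for each of them.

```julia
# Two anonymous functions with the same body, defined separately.
f = x -> x + 1
g = x -> x + 1

# Each anonymous function definition gets its own type.
same_type = typeof(f) == typeof(g)   # false

data = collect(1:10)

# The first call with `f` includes compiling `map` for typeof(f).
t_first = @elapsed map(f, data)

# The second call with `f` reuses the code compiled above.
t_second = @elapsed map(f, data)

# `g` has a new type, so `map` is compiled again for typeof(g).
t_new = @elapsed map(g, data)

println("same type: ", same_type)
println("first call with f:  ", t_first, " s")
println("second call with f: ", t_second, " s")
println("first call with g:  ", t_new, " s")
```

The exact timings depend on the machine and the Julia version, so none are shown here; the point is only that the first call with each new function type pays a compilation cost that the repeated call does not.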

We are experimenting with various ways to use, e.g., @nospecialize to reduce compilation latency while ensuring that the code still runs fast on large data.
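As a rough illustration of the trade-off, the sketch below marks the function argument with `@nospecialize`. The function `transform_column` is hypothetical and is not how DataFrames.jl implements anything; it only shows the syntax and the cost. The method is not specialized on the concrete type of `f`, so a new anonymous function does not trigger a new specialization, but each call to `f` inside the loop goes through dynamic dispatch and the result is stored in a `Vector{Any}`.

```julia
# Hypothetical helper, for illustration only.
# `@nospecialize(f)` hints to the compiler that it should not generate
# a separate specialization of this method for each concrete type of `f`.
function transform_column(@nospecialize(f), col::Vector{Int})
    n = length(col)
    out = Vector{Any}(undef, n)   # element type is not inferred from `f`
    for i in 1:n
        out[i] = f(col[i])        # dynamic call: the type of `f` is unknown here
    end
    return out
end

result_inc = transform_column(x -> x + 1, [1, 2, 3])
result_sq  = transform_column(x -> x^2, [1, 2, 3])
```

Whether this is a net win depends on the workload: for large columns the per-element dynamic dispatch can dominate, which is why the fast path usually still needs a specialized inner kernel.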

However, what I struggle with is finding a convenient way to check what gets compiled, and how, when a function is called with a new signature. It would be great to hear from the experts in the field what the recommended approaches are (currently I rely mostly on my general understanding of how things work and on trial and error).

Thank you!

tim.holy · October 13, 2020, 1:40pm

Didn’t see this before I posted “Understanding precompilation and its limitations (reducing latency)”, but that addresses some of what you’re asking about.

bkamins · October 13, 2020, 1:51pm

Thank you!

tim.holy · October 13, 2020, 2:31pm

Also, just in general terms this is pretty much the main purpose of SnoopCompile. @snoopi measures inference, @snoopc codegen (native code generation), and @snoopr invalidations.

tim.holy · October 13, 2020, 2:46pm

One more thing: it looks like DataFrames/CategoricalArrays still invalidates an awful lot of MethodInstances. If you invalidate something that you also depend on, good luck getting precompilation to work.

Even if you don’t add any precompile directives, eliminating invalidations is also probably your easiest way to reduce latency. Anything you invalidated was already compiled, and if some of those MethodInstances are also useful to DataFrames then you’re just going to have to compile them again. If you can avoid that, you save both the inference time and the codegen time. (precompile directives currently only save you inference time; nice, but it’s even better if you can save both.)
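For readers unfamiliar with precompile directives, a minimal sketch using Base's `precompile` function follows. The function `double_all` is a made-up stand-in for a package function. In a real package the directive would sit at the top level of the package module, so that it runs while the package itself is being precompiled.

```julia
# Hypothetical package function, used only for illustration.
double_all(v::Vector{Float64}) = [2x for x in v]

# Ask Julia to compile this signature ahead of the first real call.
# `precompile` returns `true` if a matching method was found and compiled.
ok = precompile(double_all, (Vector{Float64},))

println("precompile succeeded: ", ok)
println(double_all([1.0, 2.5]))
```

If something the directive depends on is later invalidated, the cached work is thrown away and has to be redone, which is the situation the post above warns about.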

bkamins · October 13, 2020, 3:40pm

The first step is to remove CategoricalArrays.jl as a dependency of DataFrames.jl, and we are almost there on master already. Thank you for looking into this.
