Best practices for profiling compilation

In DataFrames.jl we see a strong trend that people want more and more powerful functions allowing for transformation of data frames (to match dplyr, data.table, pandas, …).

This leads to a situation where the code base becomes really complex, which means long compilation times. Additionally, transformation functions most often take anonymous functions as arguments, so this is very often a “run once” case: most of the time the user will not call the transformation twice with the same argument types, because every anonymous function has its own new type.
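To make the “run once” point concrete, here is a minimal sketch in plain Julia. The names `inc1`, `inc2` and `data` are made up for illustration and `map` stands in for a transformation function; this is not DataFrames.jl code.

```julia
# Two anonymous functions with identical bodies (hypothetical names).
inc1 = x -> x + 1
inc2 = x -> x + 1

# Each anonymous function definition gets its own type.
println(typeof(inc1) == typeof(inc2))  # false

data = collect(1:1000)

# The first call with inc1 includes compiling `map` for typeof(inc1).
t_first = @elapsed map(inc1, data)
# The second call with inc1 reuses the already compiled code.
t_second = @elapsed map(inc1, data)
# A call with inc2 needs a fresh specialization, despite the identical body.
t_other = @elapsed map(inc2, data)

println((t_first, t_second, t_other))
```

The exact timings depend on the machine and Julia version, but the first and third calls typically include compilation while the second does not.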

We are experimenting with various ways of using e.g. @nospecialize to reduce compilation latency while ensuring that the code still runs fast on large data.
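As a rough sketch of what such an experiment can look like (the function `apply_rowwise` is hypothetical and not part of DataFrames.jl), `@nospecialize` on the function argument asks the compiler not to generate a separate specialization for every anonymous function passed in:

```julia
# Hypothetical helper, for illustration only.
# @nospecialize(f) hints that the method should not be specialized
# on the concrete type of f, so one compiled version can serve many callers.
function apply_rowwise(@nospecialize(f), v::Vector{Float64})
    # The result type of f is unknown here, so collect into Vector{Any}.
    out = Vector{Any}(undef, length(v))
    for i in eachindex(v)
        out[i] = f(v[i])  # dynamic dispatch on every call to f
    end
    return out
end

# Two different anonymous functions, same compiled method body.
println(apply_rowwise(x -> x + 1, [1.0, 2.0]))
println(apply_rowwise(x -> 2x, [1.0, 2.0]))
```

The trade-off is that each call to `f` inside the loop goes through dynamic dispatch, which is why it matters to check that the inner work stays fast for large data.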

However, the thing I struggle with is finding a convenient way to check what gets compiled, and how, when a function is called with a new signature. It would be great to hear from the experts in the field what the recommended approaches are (currently I rely mostly on my general understanding of how things work, plus trial and error).

Thank you!

I didn’t see this before I posted “Understanding precompilation and its limitations (reducing latency)”, but that post addresses some of what you’re asking about.

Thank you!

Also, in general terms, this is pretty much the main purpose of SnoopCompile: @snoopi measures inference, @snoopc measures codegen (native code generation), and @snoopr measures invalidations.

One more thing: it looks like DataFrames/CategoricalArrays still invalidates an awful lot of MethodInstances. If you invalidate something that you also depend on, good luck getting precompilation to work.

Even if you don’t add any precompile directives, eliminating invalidations is also probably your easiest way to reduce latency. Anything you invalidated was already compiled, and if some of those MethodInstances are also useful to DataFrames then you’re just going to have to compile them again. If you can avoid that, you save both the inference time and the codegen time. (precompile directives currently only save you inference time; nice, but it’s even better if you can save both.)
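For readers who have not seen one, a precompile directive is just a call to Base's `precompile` with a function and a tuple of argument types. A minimal sketch, assuming a made-up function `colsum` standing in for a package method:

```julia
# Hypothetical function standing in for a method defined in a package.
colsum(v::AbstractVector) = sum(v)

# A precompile directive: request compilation for this concrete signature
# ahead of the first real call. It returns true when the request succeeded.
ok = precompile(colsum, (Vector{Float64},))
println(ok)

# A later call with a matching signature can reuse that work.
println(colsum([1.0, 2.0, 3.0]))
```

In a package, such directives usually sit at the end of the module so they run during precompilation. If the resulting MethodInstances are invalidated later by another package, that saved work is lost, which is the point made above.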

The first step is to remove CategoricalArrays.jl as a dependency of DataFrames.jl and we are almost there on master already. Thank you for looking into this.
