Reading the ProfileView Flamegraph

cortner · April 26, 2021, 2:33pm

Below is a flamegraph from a profiling run I just performed. For context, this is not exactly performance-critical code, but it cannot be arbitrarily slow either. I would like to make sure I am reading and interpreting it correctly, specifically in connection with the following comment from the documentation:

      Red bars are problematic only when they account for a sizable fraction of the top of a call stack, as only in such cases are they likely to be the source of a significant performance bottleneck

The red indicates type instabilities. My reading of the graph is that there are quite a few, but because they occur quite “low” in the call stack, all the “actual work” is done in a type-stable way. Therefore those type instabilities likely don’t affect performance too significantly. By contrast, if I had lots of red bars near the “top” of the graph, I could gain a lot by removing those type instabilities.

Do you agree with these statements? If not, I’d be grateful for more comments.

(I’m of course aware that type instabilities are not the only performance problem and there are other things to check too, but this post is just about type instability.) Thank you.

ffevotte · April 26, 2021, 3:13pm

My understanding is that what counts is not really the depth of the “red bars” in the call stack. Rather, it’s whether the length of a “red bar” (i.e. the time spent in a function performing dynamic dispatch) is significantly larger than the combined length of the bars stacked immediately on top of it (i.e. the time spent in sub-calls).

If the length of a “red bar” is completely covered by the bars stacked on it, most of the time is spent in the function calls it makes. However, when a significant fraction of a “red bar” is not covered by any bar on top of it, it might mean that a significant fraction of the time is spent in the dynamic dispatch itself. (It might also mean that the unstable function itself spends some time computing things on its own.)

This happens in a few places in your flamegraph, but it does not seem to be too much of an issue. Assuming all of these places are actually 100% dynamic dispatch time, then by fixing your code to be type-stable in those places, you could expect the circled lengths to shrink to almost nothing, which would be perhaps a 10-15% gain (by my loose guesstimate).
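To make that arithmetic concrete, here is a minimal sketch, written in Python purely for illustration. The frame names and sample counts are made up and are not read from the flamegraph in this thread, and `Frame`, `self_width` and `red_self_width` are hypothetical helpers, not part of ProfileView. It computes the uncovered part of each red bar (its width minus the widths of the bars stacked directly on it) and sums those parts to get a rough upper bound on the gain, under the assumption that all of the uncovered time is dispatch overhead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    name: str
    width: int                 # number of samples with this frame on the stack
    unstable: bool = False     # True if the bar would be drawn red
    children: List["Frame"] = field(default_factory=list)


def self_width(frame: Frame) -> int:
    """Part of the bar not covered by the bars stacked directly on top of it."""
    return frame.width - sum(child.width for child in frame.children)


def red_self_width(frame: Frame) -> int:
    """Total uncovered width of all red bars in the subtree rooted at `frame`."""
    own = self_width(frame) if frame.unstable else 0
    return own + sum(red_self_width(child) for child in frame.children)


# Hypothetical profile: two red bars, each mostly covered by a type-stable callee.
root = Frame("main", 1000, children=[
    Frame("assemble", 600, unstable=True, children=[Frame("kernel", 520)]),
    Frame("solve", 400, unstable=True, children=[Frame("factorize", 360)]),
])

# Upper bound on the gain if every uncovered red sample were pure dispatch time.
bound = red_self_width(root) / root.width
print(f"{bound:.0%}")  # prints 12%
```

With these made-up numbers the red bars leave 80 and 40 samples uncovered out of 1000, so removing the instabilities could save at most about 12% of the run time. As noted above, part of the uncovered width may be real work done by the unstable function itself, so the actual gain could be smaller.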

Raf · April 26, 2021, 3:36pm

@ffevotte pretty much covered this.

But also note that in some cases you can’t really know the performance gain from removing type instabilities just by looking at the flamegraph. It may be more than the width of the top-level red in the graph. In high-performance situations, removing code and reducing cache use in one area may make another area faster.

cortner · April 27, 2021, 2:33am

Thank you both - that confirms and clarifies my understanding.
