I am wondering if there is an alternative way to show profiler results (ideally in VS Code) other than a simple flame graph. What I’m basically looking for is a way to see the cumulative time spent in a function or line of code that may have been called multiple times from different functions. So e.g. I have a function `foo` which is called from `function_1`, `function_2`, and `function_3`: the flame graph shows me how much time was spent in each of `function_1`–`function_3`, but it’s hard to see whether `foo` was actually the bottleneck, since its time is split up between the three functions.
I tried to illustrate my problem with my real-world profiling results.
So I’m wondering whether there is a function among these many small bars that occurs many times and is worth optimizing, or that might give me a hint as to whether, e.g., it’s worth using StaticArrays. The overall model is many thousands of lines of code.
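For what it’s worth, the stdlib profiler can already aggregate sample counts per function/line across all callers via its flat text view. A minimal sketch (the `foo`/`function_1`–`function_3` names are hypothetical stand-ins for the situation described above):

```julia
using Profile

# Hypothetical workload: foo is called from three different functions,
# so its cost is split across the flame graph.
foo(x) = sum(sin, x)
function_1(x) = foo(x) + 1
function_2(x) = foo(x) * 2
function_3(x) = foo(x) - 3

data = rand(10^6)
Profile.clear()
@profile for _ in 1:50
    function_1(data); function_2(data); function_3(data)
end

# format=:flat aggregates sample counts per line regardless of caller,
# so foo's total cost shows up as a single entry at the top.
Profile.print(format=:flat, sortedby=:count)
```

This doesn’t give you the VS Code UI, but it answers the “cumulative time regardless of caller” question with nothing beyond the stdlib.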
You can use PProf.jl’s directed graph view; this may address your problem:
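A minimal usage sketch, assuming PProf.jl is installed (`] add PProf`; it is not part of the stdlib, and `foo` here is just a placeholder workload):

```julia
# Collect samples with the stdlib profiler, then hand them to PProf.
using Profile, PProf

foo(x) = sum(sin, x)          # hypothetical hot function
Profile.clear()
@profile foo(rand(10^7))

# Writes profile.pb.gz and opens the pprof web UI, which includes
# the directed call-graph view and a "Top" table of cumulative times.
pprof()
```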
To me it sounds like this should be solved by the built-in `@profview` in VS Code, which has both the flame graph and an inline view.
The inline view can show how many samples were collected for any single line of code, IIUC.
Thanks already for the replies! PProf’s top view is what I was looking for. I would really like to see a view like that implemented in the VS Code profiler as well.
To revive a dead thread:
I find the flame graphs baffling, and I “profile” by hand by wrapping chunks of code in `startTime = time()` … `stopTime = time() - startTime` and printing the results.
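That manual pattern can be written a bit more compactly with the stdlib `@elapsed` macro; a sketch with two placeholder chunks:

```julia
# Same idea as the manual startTime/stopTime pattern, but @elapsed
# returns the wall-clock seconds of the wrapped expression directly.
chunk_a() = sum(sqrt, 1:10^6)
chunk_b() = sort(rand(10^5))

t_a = @elapsed chunk_a()
t_b = @elapsed chunk_b()
println("chunk_a: $(round(t_a; digits=4)) s")
println("chunk_b: $(round(t_b; digits=4)) s")
```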
It would be nice if the profiler’s output looked more like TimerOutputs.jl’s (which doesn’t actually seem to work with the profiler), or allowed you to drill down as in Spyder:
Right off the bat, I can see that `iterativeTukeyRegression()` is taking up the most time, and if I drill down, I can see that `median()` is the culprit:
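For anyone unfamiliar with the output style being asked for, here is a minimal TimerOutputs.jl sketch (assumes the package is installed via `] add TimerOutputs`; the section labels and inner workloads are made up for illustration):

```julia
using TimerOutputs

const to = TimerOutput()

function analysis()
    @timeit to "regression" begin
        @timeit to "median" sort!(rand(10^6))    # hypothetical inner step
        @timeit to "residuals" sum(rand(10^6))   # hypothetical inner step
    end
end

analysis()
# Prints a nested table: call counts and cumulative time per label,
# with inner sections shown under their parent.
show(to)
```

Because each `@timeit` label is aggregated across all calls, repeated calls to the same chunk show up as one row with a call count, which is the drill-down style described above.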
TimerOutputs is a tracing profiler, as opposed to the sampling profiler in the Julia stdlib, and the screenshot looks like a tracing profiler as well. For fancier tracing profilers, I’ve had success with NVTX.jl (it works on CPU too, but you need to download NVIDIA’s software to view the profiles). I think Tracy.jl should work as well (and there is a build option to integrate it with the Julia runtime, so you can view time spent within the runtime itself), but I wasn’t able to figure out how to use it / view the right profiles. There is also IntelITT.jl, but I haven’t used it.
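A minimal NVTX.jl sketch, assuming the package is installed (`] add NVTX`) and the trace is viewed with NVIDIA Nsight Systems; the function and range label are placeholders:

```julia
using NVTX

function step!(x)
    # Annotates this region in the trace; visible when the script is
    # run under Nsight Systems, e.g.: nsys profile julia script.jl
    NVTX.@range "expensive step" begin
        sum(sin, x)
    end
end

step!(rand(10^6))
```

The annotations are cheap no-ops when no profiler is attached, so the instrumented code can stay in place.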