Profiling code reveals that most time is used calling 3 functions in task.jl

gobs · November 5, 2019, 3:11pm

I profiled my call to a function and got the following plot:

Those first 3 long horizontal lines are calls to task_done_hook, then wait, then poptaskref (in ascending order). This happens even for the 2nd or 3rd call to my function, and I don’t understand what they could be. I stopped specifying a lot of the types used / returned in my functions after a while since I was losing time doing so properly - could this be the cause? How could I find out?

mauro3 · November 5, 2019, 3:18pm

Did you do Profile.clear() inbetween?

I stopped specifying a lot of the types used / returned in my functions after a while

This should not impact performance.

kristoffer.carlsson · November 5, 2019, 3:38pm

You can also give C = true to ProfileView to see if there are C-frames hiding in the dark… eeh I mean in the light.

gobs · November 5, 2019, 4:46pm

I used @profview, which includes the profile clear command:
@profview f(args...) is just shorthand for Profile.clear(); @profile f(args...); ProfileView.view()

gobs · November 5, 2019, 4:54pm

I did that, and now I see that the last function which is called (topmost line on the left hand side) is libpthread.so.0, pthread_cond_wait: line -1; - any help interpreting that?

EDIT: There’s also a function call before that which is marked in red (which is bad as I understand): gf.c, jl_apply_generic: line 2191

Pbellive · November 5, 2019, 5:19pm

Are you running multiple julia threads? I found similar results when profiling anything if julia was launched with more than one thread. Profiling seems to work properly for me only when julia is launched with a single thread. If you’re using Juno, note that it launches julia with multiple threads by default. See the following, for example:

https://github.com/JunoLab/Juno.jl/issues/205

kristoffer.carlsson · November 5, 2019, 5:23pm

I also remember getting that big bar when running things with multiple threads a long time ago.

gobs · November 6, 2019, 8:23am

Huh… If that’s the case, I guess my code really is just quite slow then! Will report back

gobs · November 6, 2019, 8:39am

So I set ENV["JULIA_NUM_THREADS"] = 1 in the REPL, ran my script and I get the same problem. I added the line @show ENV["JULIA_NUM_THREADS"] in my script and it correctly shows that I’m using 1 thread.

However, I also just wrote a bunch of print statements in my function to see how long it would take them to appear and it seems that my function really is just slow (it makes an optimisation problem so it takes about 5 seconds to run) But at least now I know

LaurentPlagne · November 6, 2019, 10:56am

I have the same “kind” of problem.

Setting the thread number to 1 removes a spurious pink/salmon/ping part in the profile view.
Once the thread number is set to 1, there is still a pink/red/pink “spurious” part that does not correspond to real time (IMHO) because the measured time is proportional to my real iteration number.
One thing that could be nice is to add a scale on the profile view axis (e.g. time in second for x axis)

pink

 /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl, macro expansion: line 118

red

  ./task.jl, #26: line 268
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl, eval_user_input: line 86

pink

./boot.jl, eval: line 330

Profile with thread=1

Profile with thread=4

Note that the actual (@time) time is always the same (1.6 seconds)

LaurentPlagne · November 6, 2019, 11:02am

OK, I have found that this spurious thing does appear when I use Juno. From the Julia REPL I have the following (correct) picture

Tamas_Papp · January 14, 2020, 4:34pm

I can replicate this problem with

julia 1.3, also julia 1.4 branch

versions

(@v1) pkg> st --manifest ProfileView
Status `~/.julia/environments/v1/Manifest.toml`
  [159f3aea] Cairo v1.0.2
  [5ae59095] Colors v0.11.2
  [5789e2e9] FileIO v1.2.1
  [a2bd30eb] Graphics v1.0.1
  [4c0ca9eb] Gtk v1.1.2
  [27996c0f] GtkReactive v1.0.2
  [8197267c] IntervalSets v0.3.2
  [c46f51b8] ProfileView v0.5.3
  [b77e0a4c] InteractiveUtils 
  [9abbd945] Profile 
  [cf7118a7] UUIDs

environment variable JULIA_NUM_THREADS set to 4 or 8.
Linux

Setting the latter to 1 fixes things.

I am not sure where to report this, Julia or ProfileView?

kristoffer.carlsson · January 14, 2020, 4:44pm

Because Juno sets the number of Julia CPUs.

I think profiling threaded code just doesn’t work well right now.

bjarthur · July 23, 2021, 4:16pm

agreed. profiling threaded code gives strange results. all my time is reported to be spent in task_done_hook.

n8xm · July 26, 2021, 10:19pm

I have had the same experience. I’ve found that setting the number of threads to be 1 is necessary to avoid spurious results in the profiler.

Should this be reported as an issue in Julia’s standard library, specifically Profile.jl? The Juno issue mentioned by @Pbellive ( https://github.com/JunoLab/Juno.jl/issues/205 ) is now closed, with the comment that in Juno, “we [now] filter profile traces much better.”

Outside of Juno, though, this still appears to be an issue, and it’s not obvious to a newcomer that multiple threads are the culprit.

Tamas_Papp · July 27, 2021, 8:04am

I think it would be worth opening an issue at ProfileView.jl.

Skoffer · July 27, 2021, 10:13am

Or even better in https://github.com/timholy/FlameGraphs.jl

Tamas_Papp · July 27, 2021, 11:20am

Done:

~~https://github.com/timholy/FlameGraphs.jl/issues/40~~

https://github.com/JuliaLang/julia/issues/41713

mauro3 · September 26, 2021, 11:07am

This is now fixed with https://github.com/JuliaLang/julia/pull/41742. Which, I think, will feature in Julia 1.8.

Damiano · March 2, 2025, 9:03pm

I have the same problem today on a Pluto notebook with Julia 1.11.3. If I just do Pluto.run(), the notebook starts with some threads (more than 1, as this is the default behaviour). Doing using ProfileCanvas and then @profview on my function gives me that 83% of the samples are in task_done_hook, wait, poptask , similarly to what was happening in the original question asked. The remaining 17% is given by #1, eval, run_expression. If instead I run Pluto with Pluto.run(threads=1), then everything is given by eval. The result of @time is the same in both cases (Pluto launched with many threads or Pluto launched with a single thread), again as in the original question asked.
Does anybody know if this is a different issue, or a regression, or else?
Of course I am happy to provide more info on my system and on my specific code if you think this could help.

Topic		Replies	Views
Why is the profile view so scrunched up to the right? General Usage	7	478	April 5, 2022
How to understand the ProfileView output Performance question , package , plotting , profiling	3	1838	April 29, 2022
Profile results contains no samples from my code General Usage profile	2	85	May 6, 2025
Profiler doesn't show Threads New to Julia multithreading , profiling	6	537	February 6, 2024
Why is my root process so much longer than my other processes? Profiling	3	134	November 18, 2024

Profiling code reveals that most time is used calling 3 functions in task.jl

Related topics