Profiling code with heavy IO time

Usually I can get a lot of mileage out of the user-friendly @profview macro, and it gives me all the info I need to spot performance bottlenecks. However, sometimes I have code that does a lot of IO (e.g., downloading various data over the internet). In these cases, I have noticed that @profview (and, I assume, the underlying machinery in Profile) seems not to capture the time spent in IO. It presents me with flamegraphs in which the IO time is essentially absent.
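To illustrate the symptom (a generic sketch, not my actual code): Julia's sampling profiler only records where the CPU is when a sample is taken, so a task that is parked waiting on IO (here simulated with `sleep`, standing in for a blocking download) contributes samples to the scheduler's wait machinery rather than to the function that initiated the IO.

```julia
using Profile

function io_heavy()
    sleep(2.0)        # stand-in for a blocking download over the network
    sum(rand(10^7))   # small CPU-bound tail
end

Profile.clear()
@profile io_heavy()
Profile.print()
```

In the printed report, the CPU tail dominates the samples attributed to `io_heavy`, while the two seconds of waiting show up (if at all) under internal task/scheduler frames, which is roughly the flamegraph shape described above.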

I don’t have an MWE I can share, but I’m wondering if anyone else has run into this and has an alternative mechanism for profiling IO-heavy code.

So far, I have made some progress using the excellent TimerOutputs.jl package, but it is a pretty manual process to sprinkle @timeit annotations throughout a large package.
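For anyone unfamiliar with that approach, here is a minimal sketch of the manual instrumentation (the `fetch_data` function and the `sleep` standing in for a download are hypothetical): because TimerOutputs measures wall-clock time around each annotated block, it does capture IO waits that the sampling profiler misses.

```julia
using TimerOutputs

const to = TimerOutput()

function fetch_data(url)
    # Wall-clock timing captures the full IO wait, unlike a CPU sampler.
    @timeit to "download" sleep(0.1)      # stand-in for an actual download of `url`
    @timeit to "process"  sum(rand(10^6)) # stand-in for CPU-bound post-processing
end

fetch_data("https://example.com")
print_timer(to)   # report showing time and allocations per labeled section
```

The downside, as noted, is that every section you care about needs its own `@timeit` label, which does not scale well across a large codebase.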