Strange timing issue with code

julia> @btime p = LightGraphs.dfs_parents(g, 2);
  210.919 ms (3670024 allocations: 216.13 MiB)

julia> @btime LightGraphs.tree(p);
  270.824 ms (5767200 allocations: 410.00 MiB)

julia> @btime LightGraphs.tree(LightGraphs.dfs_parents(g, 2))
  1.398 s (9437224 allocations: 626.13 MiB)

I’m missing something here, but why is the elapsed time so much greater for the combined functions than running them in series? Note that the allocations appear to be correctly cumulative.

ETA: @time is a bit more instructive, I think:

julia> @time p = LightGraphs.dfs_parents(g,2);
  0.214946 seconds (3.67 M allocations: 216.126 MiB)

julia> @time LightGraphs.tree(p);
  0.300225 seconds (5.77 M allocations: 410.001 MiB)

julia> @time LightGraphs.tree(LightGraphs.dfs_parents(g, 2));
  4.214631 seconds (9.44 M allocations: 626.127 MiB, 85.10% gc time)

Lots of GC time, and much worse combined performance. Is there a way to optimize away the garbage collector?

Edit 2:
Calling gc() explicitly before the combined function fixes things:

julia> @time p = LightGraphs.dfs_parents(g,2);
  0.486441 seconds (3.69 M allocations: 217.065 MiB, 35.11% gc time)

julia> @time LightGraphs.tree(p);
  1.181900 seconds (5.78 M allocations: 410.747 MiB, 51.94% gc time)

julia> gc()

julia> @time LightGraphs.tree(LightGraphs.dfs_parents(g, 2))
  1.566697 seconds (9.44 M allocations: 626.127 MiB, 67.12% gc time)

You need to use $ to interpolate the values of global variables with at-btime.

@dpsanders - Thanks. I did that already, but I’m still getting similar timings. It appears to be gc-related, but I don’t quite know how to keep it from happening.

I don’t think you have any other resort than looking into ways of reducing your allocations. It may be possible to tweak exactly when garbage collection happens but at some point you have to pay the price for producing lots of garbage.

(May or may not be related to the issue here but) just to be clear, producing garbage doesn’t have a price (other than the allocation time), producing useful allocation does…

I’m still not sure why combining two functions f(g(x)) shows such poor timing relative to a = g(x); f(a).