Does @benchmark accurately track PyCall allocations?

In the few instances where I have compared PyCall + @benchmark against timeit within Python, I get runtimes that are very similar (occasionally timing Python from Julia is 1-2 ms faster, for some reason). But does @benchmark accurately track memory use and allocations when used with PyCall?
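
For concreteness, one way to run such a comparison from a single Julia session looks roughly like this (a sketch; the closure and number=100 are illustrative choices, not the exact code I used):

using PyCall

np = pyimport("numpy")
timeit = pyimport("timeit")

a = rand(1_000_000)

# Python's timeit accepts any callable, and PyCall wraps the Julia closure
# into a Python callable automatically; timeit.timeit returns the total
# seconds for `number` runs.
total = timeit.timeit(() -> np.sin(a), number=100)
println("mean time per call: ", total / 100 * 1e6, " μs")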

Consider this:

using BenchmarkTools
using PyCall

np = pyimport("numpy")

a = rand(1_000_000)

@benchmark sin.($a)
@benchmark np.sin($a)

----------------------------------

julia> @benchmark sin.($a)
BenchmarkTools.Trial: 
  memory estimate:  781.33 KiB
  allocs estimate:  2
  --------------
  minimum time:     672.060 μs (0.00% GC)
  median time:      674.188 μs (0.00% GC)
  mean time:        703.750 μs (1.67% GC)
  maximum time:     3.737 ms (78.29% GC)
  --------------
  samples:          7100
  evals/sample:     1

julia> @benchmark np.sin($a)
BenchmarkTools.Trial: 
  memory estimate:  782.94 KiB
  allocs estimate:  32
  --------------
  minimum time:     844.317 μs (0.00% GC)
  median time:      883.844 μs (0.00% GC)
  mean time:        911.998 μs (1.41% GC)
  maximum time:     3.835 ms (52.47% GC)
  --------------
  samples:          5477
  evals/sample:     1

The memory estimates are very similar, but the pure-Julia version makes far fewer allocations (2 versus 32). Can I trust these results?

@benchmark tracks memory allocated by the Julia runtime. It will not include any allocations made by the Python runtime.
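
If you want to see the Python side as well, one option is to wrap the call in Python's tracemalloc from Julia. This is just a sketch; note that whether an extension library reports its buffer allocations to tracemalloc depends on the library and version (recent NumPy does).

using PyCall

np = pyimport("numpy")
tracemalloc = pyimport("tracemalloc")

a = rand(1_000_000)

# tracemalloc instruments Python's allocator, so it counts allocations that
# never appear in @benchmark's Julia-side statistics.
tracemalloc.start()
np.sin(a)
current, peak = tracemalloc.get_traced_memory()  # bytes: (current, peak)
tracemalloc.stop()

println("Python-side peak during the call: ", peak / 1024, " KiB")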

Okay, that’s pretty much what I had figured. I just wanted to be sure, since I don’t have a solid understanding of how @benchmark works. Thanks!