Minimizing PythonCall overhead with BenchmarkTools

Hi there!
I’m running benchmarks with BenchmarkTools.jl on code that calls into a Python package via PythonCall.jl. My question is the following: how can I minimize the overhead of calling a method meth on a Python object obj?
Which of these options is the most efficient, or have I perhaps missed a better one? Should I just time from within Python?

using BenchmarkTools, PythonCall
# define obj and x
@btime $(obj).meth($x)
@btime $(obj.meth)($x)
@btime pycall($(obj).meth, $x)
@btime pycall($(obj.meth), $x)
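On the last point: timing from within Python with timeit gives a baseline free of any Julia-to-Python boundary cost. A minimal pure-Python sketch (the Obj class and its meth here are hypothetical stand-ins for the real object and method):

```python
import timeit

class Obj:
    """Hypothetical stand-in for the Python object under test."""
    def meth(self, x):
        return x + 1

obj = Obj()
x = 41

# timeit.repeat runs the statement `number` times per trial;
# dividing the best trial by `number` gives seconds per call.
per_call = min(timeit.repeat("obj.meth(x)",
                             globals={"obj": obj, "x": x},
                             number=100_000, repeat=5)) / 100_000
print(f"{per_call * 1e9:.1f} ns per call")
```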

Related topic (does the advice for PyCall.jl also apply here?):


You could just try a few cases and check whether one method always returns the lowest timing :slight_smile:

It seems like option 4 is the best one when benchmarking from Julia: always interpolate the bound method obj.meth rather than just the object.

Julia benchmark
julia> using BenchmarkTools, PythonCall

julia> @pyexec """
       class Fib:
           def __init__(self, n):
               self.n = n
       
           def meth(self):
               a, b = 0, 1
               for i in range(self.n):
                   a, b = b, a+b
               return a
       """ => Fib
Python: <class 'Fib'>

julia> obj_jl = Fib(10)
Python: <Fib object at 0x7fcb883f1f50>

julia> @btime $(obj_jl).meth();
  395.060 ns (4 allocations: 72 bytes)

julia> @btime $(obj_jl.meth)();
  279.845 ns (1 allocation: 16 bytes)

julia> @btime pycall($(obj_jl).meth);
  391.574 ns (4 allocations: 72 bytes)

julia> @btime pycall($(obj_jl.meth));
  278.197 ns (1 allocation: 16 bytes)
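(As a sanity check on the workload itself: the loop in meth computes the n-th Fibonacci number iteratively, so a pure-Python replica of the class above should return fib(10) = 55.)

```python
class Fib:
    """Pure-Python replica of the Fib class defined above via @pyexec."""
    def __init__(self, n):
        self.n = n

    def meth(self):
        a, b = 0, 1
        for i in range(self.n):
            a, b = b, a + b
        return a  # a holds fib(n) after n iterations

print(Fib(10).meth())  # → 55
```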

However, benchmarking from Python with timeit reports slightly lower times still:

Python benchmark
julia> using PythonCall

julia> @pyexec """
       import timeit
       import numpy as np
       
       class Fib:
           def __init__(self, n):
               self.n = n
       
           def meth(self):
               a, b = 0, 1
               for i in range(self.n):
                   a, b = b, a+b
               return a
       
       obj_py = Fib(10)
       times_py = np.array(timeit.repeat("obj_py.meth()", globals=locals(), number=100, repeat=1000)) / 100
       """ => times_py;

julia> minimum(times_py)  # result in seconds
Python: 2.6893000040217883e-07

Option two does not measure the execution time: you execute the function once during setup and interpolate the result into the expression.

The fastest of the “actual” benchmarks seems to be option 4.
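The pitfall described here, evaluating the call while setting up the benchmark so that only the finished result gets timed, has a direct analog in pure-Python timeit. A sketch reusing the Fib class from the thread:

```python
import timeit

class Fib:
    def __init__(self, n):
        self.n = n

    def meth(self):
        a, b = 0, 1
        for i in range(self.n):
            a, b = b, a + b
        return a

obj = Fib(10)
N = 100_000

# Correct: the method call itself runs on every timed iteration.
t_call = timeit.timeit("obj.meth()", globals={"obj": obj}, number=N)

# Pitfall: the call happens once, outside the timed region;
# the timed statement is only a global-name lookup.
precomputed = obj.meth()
t_lookup = timeit.timeit("precomputed",
                         globals={"precomputed": precomputed}, number=N)

print(f"call: {t_call / N * 1e9:.0f} ns/iter, "
      f"lookup only: {t_lookup / N * 1e9:.0f} ns/iter")
```

The same distinction applies to BenchmarkTools interpolation: $(obj.meth)() times the call, while $(obj.meth()) would time only the precomputed result.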


My bad, I misplaced the parentheses. Will edit
