I saved much time by assigning the Python function to a Julia name.
@btime py"pyaxpy"($y,$x,$a) # n=10
11.699 μs (23 allocations: 1.05 KiB)
pyaxpy = py"pyaxpy"
@btime pyaxpy($y,$x,$a)
4.614 μs (18 allocations: 896 bytes)
Combining with converting the inputs beforehand to PyObjects:
px,py,pa=map(PyObject,(x,y,a))
@btime pyaxpy($py,$px,$pa)
3.563 μs (6 allocations: 224 bytes)
in Python:
%timeit pyaxpy(a,b,c)
1.5 µs ± 9.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Thus, the PyCall function call overhead is approximately 2 μs.