Actually, your Python functions look like this:
def f(r, x1, x2, x3, x4, x5, x6, x7, x8):
    r = x1 + x
which very probably doesn't do what is intended: it just rebinds the local variable r
inside the function, so the object the caller passed in is never modified.
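A minimal sketch of the difference, assuming plain Python lists as stand-ins for the arrays (the names `rebind` and `write_in_place` are hypothetical and not from your code). Rebinding is a language-level behaviour, so the same applies to NumPy arrays, where `r[:] = ...` would write into the caller's array:

```python
def rebind(r, x1, x2):
    # Assigning to the bare name only points the local name r at a new
    # object; the list the caller passed in is left untouched.
    r = [a + b for a, b in zip(x1, x2)]


def write_in_place(r, x1, x2):
    # Slice assignment stores the values into the object the caller passed in.
    r[:] = [a + b for a, b in zip(x1, x2)]


x1 = [1.0, 2.0, 3.0]
x2 = [10.0, 20.0, 30.0]

out_rebind = [0.0, 0.0, 0.0]
rebind(out_rebind, x1, x2)

out_in_place = [0.0, 0.0, 0.0]
write_in_place(out_in_place, x1, x2)

print(out_rebind)    # still the original zeros
print(out_in_place)  # now holds the element-wise sums
```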
But it seems you could fix that without much impact on the benchmarks:
%timeit l(result, *x)
11.5 ms ± 54.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
def ll(x1, x2, x3, x4, x5, x6, x7, x8):
    return x1 + x2 - x3 + x4 - x5 + x6 - x7 + x8
%timeit r = ll(*x)
11.5 ms ± 112 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
For comparison, the numba version:
@numba.jit
def l2(x1, x2, x3, x4, x5, x6, x7, x8):
    return x1 + x2 - x3 + x4 - x5 + x6 - x7 + x8
%timeit r = l2(*x)
5.52 ms ± 33.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
I tried adding parallelism, but it made no difference; maybe I have too few cores (2) right now:
os.environ["NUMBA_DEBUG_ARRAY_OPT_STATS"]='1'
@numba.jit('double[:](double[:],double[:],double[:],double[:],double[:],double[:],double[:],double[:],)', nopython=True, parallel=True)
def l2p(x1, x2, x3, x4, x5, x6, x7, x8):
    return x1 + x2 - x3 + x4 - x5 + x6 - x7 + x8
Parallel for-loop #23 is produced from pattern '('arrayexpr (((((((_+_)-_)+_)-_)+_)-_)+_)',)' at <ipython-input-125-c699c2d45b59> (3)
After fusion, function l2p has 1 parallel for-loop(s) #{23}.
%timeit r = l2p(*x)
5.51 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
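The timings above come from IPython's %timeit. Outside IPython, a roughly comparable measurement could be sketched with the standard-library timeit module. This is only an approximation of what %timeit reports, and the input `x` here is a hypothetical stand-in (plain floats instead of your arrays) so that the sketch runs without NumPy:

```python
import statistics
import timeit


def ll(x1, x2, x3, x4, x5, x6, x7, x8):
    return x1 + x2 - x3 + x4 - x5 + x6 - x7 + x8


# Hypothetical stand-in inputs: eight plain floats rather than arrays.
x = [float(i) for i in range(1, 9)]

runs, loops = 7, 100  # same counts as in the %timeit output lines

# timeit.repeat returns one total time (in seconds) per run of `loops` calls.
totals = timeit.repeat(lambda: ll(*x), repeat=runs, number=loops)
per_loop = [t / loops for t in totals]

mean = statistics.mean(per_loop)
spread = statistics.pstdev(per_loop)
print(f"{mean * 1e6:.3f} µs ± {spread * 1e6:.3f} µs per loop "
      f"(mean ± std. dev. of {runs} runs, {loops} loops each)")
```

To time the real functions, swap in your own arrays for `x` and replace `ll` with `l2` or `l2p`; for the numba versions, call the function once before timing so that compilation is not included in the measurement.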