Slower summation of a series in Julia than in Python

RayleighLord · March 15, 2022, 12:01pm

Hello,

I was doing some tests in Julia and Python and I found that the following code was much slower in Julia and I do not see why this would be the case

harmonic(k) = (-1)^k / (k + 1)

function sₙ(an, n)
    S = 0.0
    for k in 0:n
        S += an(k)
    end
    return S
end

julia> @time sₙ(harmonic, 500_000)
   0.019135 seconds
0.6931481805570047

In Python I used the code

def harmonic(k):
    return (-1) ** k / (k + 1)

def sum_n(an, n):
    S = 0.0
    for k in range(n+1):
        S += an(k)
    return S

def main():
    start = timeit.timeit()
    sum_n(harmonic, 500000)
    end = timeit.timeit()
    print(abs(end - start))

>>> main()
0.000324424999738713

Any idea on why is this happening?

Sukera · March 15, 2022, 12:15pm

Which julia version are you running? The way your microbenchmark is structured and the output you’re getting suggests you’re accidentally including compilation time of the julia code in your measurement. Try running it a second time in the same session.

Also, since you write -1 in harmonic (a Int literal) instead of -1.0 (a floating point literal) there may be some conversion cost as well.

lmiq · March 15, 2022, 12:19pm

Indeed, just changing (-1)^k to (-1.0)^k makes the code faster.

RayleighLord · March 15, 2022, 12:19pm

Im running Julia 1.7. The compilation time is not included since I executed @time a second time.

However, I just tried your second suggestion and you are absolutely right. I thought that there was no cost doing (-1)^k over (-1.0)^k but it seems I got that wrong.

lmiq · March 15, 2022, 12:23pm

Seem to be even faster if you use: harmonic(k) = ifelse(isodd(k) , -1 , 1) / (k + 1)

anowacki · March 15, 2022, 12:31pm

timeit.timeit does not work this way. Each call to timeit.timeit returns the time taken to run the command given to timeit.timeit, which in your case with timeit.timeit() is no command. So your main simply returns the difference between two calls to timeit.timeit, each of which gives the time taken to run no command at all. This probably explains why sometimes the difference is negative and you have used abs to ignore that.

To correctly calculate the time taken to run sum_n(harmonic, 500000), you need to do:

import timeit

timeit.timeit("sum_n(harmonic, 500000)", setup="""
def harmonic(k):
   return (-1) ** k / (k + 1)

def sum_n(an, n):
   S = 0.0
   for k in range(n+1):
       S += an(k)
   return S
""", number=1)

Note however that timeit.timeit by default uses many evaluations (a million) of the expression you supply, so the above would take a very long time to return. Use the number argument to timeit.timeit to get an answer in an acceptable time.

Doing that for me gives about 250 ms for the Python code above. So much slower than your original Julia.

rvasil · March 15, 2022, 12:43pm

Not a performance tip, but you can simplify sₙ function like this
sₙ(an, n) = sum(an(k) for k in 0:n)

lawless-m · March 15, 2022, 12:43pm

I can get the wrong answer even quicker !

julia> @btime sₙ(harmonic, 500_000)
  648.000 μs (0 allocations: 0 bytes)
0.6931481805570047

julia> function sₙ(an, n)
    S = 0.0
    for k in 0:n
        @fastmath S += an(k)
    end
    return S
end

julia> @btime sₙ(harmonic, 500_000)
  444.600 μs (0 allocations: 0 bytes)
0.6931481805569676

DNF · March 15, 2022, 4:18pm

Yes, definitely do this instead of (-1.0)^k, or (-1)^k. Repeatedly calculating a power for gradually increasing exponents is needlessly wasteful. It’s nothing to do with conversion cost, just that (-1)^500_000 needs to do approximately log2(500_000) = 19 multiplications, for each iteration. It doesn’t really make any sense to redo all of those calcualations over and over.

The same goes for Python, use some version of isodd, or keep track of the previous value of (-1)^k and get the next by just multiplying by -1.

This is a common performance pitfall. Too often you can find code that implements something like \sum_{k=1}^N x^k/k! dutifully by

function slowsum(x, n)
    val = 0.0
    for i in 1:n
        val += x^i / factorial(i)
    end
    return val
end

instead of

function fastsum(x, n)
    val = 0.0
    part = 1.0
    for i in 1:n
        part *= x / i
        val += part
    end
    return val
end

with results like these:

1.7.2> @btime slowsum(x, 15) setup=(x=2+rand())
  377.451 ns (0 allocations: 0 bytes)
15.747545023566579

1.7.2> @btime fastsum(x, 15) setup=(x=2+rand())
  12.613 ns (0 allocations: 0 bytes)
10.771366140823954

Not to mention the overflow behaviour of slowsum.

stevengj · March 15, 2022, 4:36pm

The FastPow.jl package will do this transformation for you on (-1)^k if you just put @fastpow in front of your function.

Note that it’s generally even better to compute each term in a series by iteratively updating the previous term, like this:

S = 0.0
term = 1
for k = 0:n
    S += term / (k+1)
    term = -term
end

or you could even unroll:

S = 0.0
for k = 0:2:n
    S += inv(k+1) - inv(k+2)
end

These kinds of tricks become even more important for accumulating series where the terms are more complicated, e.g. `\sum_{k=0}^n z^k / k!:

S = 0.0
term = one(z) / 1
for k = 0:n
    S += term
    term *= z / (k+1)
end

DNF · March 15, 2022, 4:44pm

That’s some coincidence, you chose the exact same example as I did, while we were simultaneously editing😁

Topic		Replies	Views
Julia speed Performance question	28	2263	March 14, 2024
Question on simple performance comparison between Python and Julia General Usage question	23	1043	June 13, 2023
General questions from Python user Performance	59	4322	March 8, 2021
Why the function sum1 is faster than builtin sum Performance question , profiling	14	1041	September 21, 2023
Optimized Python is as good as Julia Performance question	28	25301	June 2, 2025

Slower summation of a series in Julia than in Python

Related topics