Julia vs. Python: Euler's sum of powers conjecture

The Julia code on this page is much slower than the Python examples.
Euler’s sum of powers conjecture with code examples
I wrote a Julia program similar to the fastest Python example:
Julia

function main()
  MAX = 250  
  p5 = Dict()
  sum2 = Dict()
  sk = []

  for i in range(1, MAX)
       p5[i^5] = i
       for j in range(i, MAX)
               sum2[i^5 + j^5] = (i, j)
       end
  end

  sk = sort(collect(keys(sum2)))

  for p in sort(collect(keys(p5)))
       for s in sk
               if p <= s ; break ; end
               if  p - s in keys(sum2)
                       println(p5[p], sum2[s],  sum2[p-s])   
#                       exit()
               end
       end
  end
end

main()

~]$ time julia euler2.jl
144(27, 84)(110, 133)
144(27, 110)(84, 133)
144(84, 110)(27, 133)
144(27, 133)(84, 110)
144(84, 133)(27, 110)
144(110, 133)(27, 84)

real 0m0.996s
user 0m0.982s
sys 0m0.222s

Python

MAX = 250
p5, sum2 = {}, {}
sk = []

for i in range(1, MAX):
       p5[i**5] = i
       for j in range(i, MAX):
               sum2[i**5 + j**5] = (i, j)
 
sk = sorted(sum2.keys())

for p in sorted(p5.keys()):
       for s in sk:
               if p <= s: break
               if p - s in sum2:
                       print(p5[p], sum2[s] + sum2[p-s])
#                       exit()

]$ time python euler2.py
144 (27, 84, 110, 133)
144 (27, 110, 84, 133)
144 (84, 110, 27, 133)
144 (27, 133, 84, 110)
144 (84, 133, 27, 110)
144 (110, 133, 27, 84)

real 0m0.559s
user 0m0.546s
sys 0m0.009s

I was surprised that the speed of these basic loop algorithm differed by almost a factor of two.
The Python can also be made faster by putting the nested loop into a method, like so:

MAX = 250
p5, sum2 = {}, {}
sk = {}
for i in range(1, MAX):
    p5[i**5] = i
    for j in range(i, MAX):
        sum2[i**5 + j**5] = (i, j)
 
sk = sorted(sum2.keys())
def methodC ():
    for p in sorted(p5.keys()):
        for s in sk:
            if p <= s: break
            if p - s in sum2:
                yield p5[p], sum2[s] + sum2[p-s]

for v1, v2 in methodC ():
    print (v1, v2)

~]$ time python euler3.py
144 (27, 84, 110, 133)
144 (27, 110, 84, 133)
144 (84, 110, 27, 133)
144 (27, 133, 84, 110)
144 (84, 133, 27, 110)
144 (110, 133, 27, 84)

real 0m0.368s
user 0m0.351s
sys 0m0.014s

Does anyone have any suggestions on how to optimize this Julia program to make it faster?

I guess the question is better suited for the performance category. You can just add a tag to your question. :slight_smile:

haskey(sum2, p - s) should be faster, O(1) vs O(n).

You are measuring Julia startup time and compilation time. That may be fair in your case, but fast startup is not where Julia excels. Benchmarking is usually done with @btime to show the best case.

With @btime on a slower computer, need to use same system for comparison with above results
~]$ julia euler2.jl
144(27, 84)(110, 133)
144(27, 110)(84, 133)
144(84, 110)(27, 133)
144(27, 133)(84, 110)
144(84, 133)(27, 110)
144(110, 133)(27, 84)
0.649362 seconds (8.53 M allocations: 140.497 MiB, 6.06% gc time, 20.67% compilation time)

And changing to haskey
~]$ julia euler2.jl
144(27, 84)(110, 133)
144(27, 110)(84, 133)
144(84, 110)(27, 133)
144(27, 133)(84, 110)
144(84, 133)(27, 110)
144(110, 133)(27, 84)
0.612808 seconds (6.04 M allocations: 102.580 MiB, 4.02% gc time, 21.72% compilation time)

~30 ms faster

You should check the suggestion from @math_opt. A simple definition of types:

function main()
    MAX = 250
    p5 = Dict{Int,Int}()
    sum2 = Dict{Int,Tuple{Int, Int}}()
    sk = Int64[]

    for i in range(1, MAX)
        p5[i^5] = i
        for j in range(i, MAX)
            sum2[i^5 + j^5] = (i, j)
        end
    end

    sk = sort(collect(keys(sum2)))

    @inbounds for p in sort(collect(keys(p5)))
        for s in sk
            if p <= s ; break ; end
            if  p - s in keys(sum2)
                println(p5[p], sum2[s],  sum2[p-s])   
                #                       exit()
            end
        end
    end
end

main()

Give me similar results than using Python.

time python euler.py
144 (27, 84, 110, 133)
144 (27, 110, 84, 133)
144 (84, 110, 27, 133)
144 (27, 133, 84, 110)
144 (84, 133, 27, 110)
144 (110, 133, 27, 84)

real	0m0.696s
user	0m0.686s
sys	0m0.009s

And using Julia:

time julia euler2.jl 
144(27, 84)(110, 133)
144(27, 110)(84, 133)
144(84, 110)(27, 133)
144(27, 133)(84, 110)
144(84, 133)(27, 110)
144(110, 133)(27, 84)

real	0m0.625s
user	0m0.695s
sys	0m0.257s

However, the source code can be improved a lot.

1 Like

I’ve moved it to performance.

For OP’s reference, that’s the fifth tip here Performance Tips · The Julia Language.

Note that types don’t have to be specified by hand. For example, the for-loop

  p5 = Dict()
  sum2 = Dict()
  sk = []

  for i in range(1, MAX)
       p5[i^5] = i
       for j in range(i, MAX)
               sum2[i^5 + j^5] = (i, j)
       end
  end

can be replaced with

p5 = Dict([i^5 => i for i in 1:MAX])
sum2 = Dict([i^5 + j^5 => (i, j) for i in 1:MAX for j in i:MAX])

x in keys(dict) is O(1). The keys(dict) object is just a lazy wrapper, and has a specialized in method that is equivalent to haskey.

2 Likes

(Using sort! here would eliminate a temporary array.)

If you want to sort dictionaries by keys, you might consider using a SortedDict data structure. However, in this case, since you only need to sort it once, I tried it and it is faster just to use an ordinary Dict and sort! at the end.

This constructs an array, then copies it to a dictionary, then throws away the array. You can instead do p5 = Dict(i^5 => i for i in 1:MAX) to construct the dictionary directly with no intermediate array.

For p5 - yes, it’s likely better to use the generator.
But for sum2, creating the array first

sum2 = Dict([i^5 + j^5 => (i, j) for i in 1:MAX for j in i:MAX])

is better for the function as a whole. This way is typestable, while

sum2 = Dict(i^5 + j^5 => (i, j) for i in 1:MAX for j in i:MAX)

is not, and sum2 wouldn’t fully infer in the remaining part of the function. Maybe, this will be made more inferrable in a future julia version…

Good to know. Thanks. I read a doc string for keys that said it returned an iterator. I must not have looked closely enough to find the specialized method.

Yes, you have to do:

sum2 = Dict{Int,Tuple{Int,Int}}(i^5 + j^5 => (i, j) for i in 1:MAX for j in i:MAX)

I had already tried the SortedDict and found that it was slower. I will compare sort with sort!.

function main(MAX=250)
    p5 = Dict(i^5=>i for i in 1:MAX)
    sum2 = Dict{Int,Tuple{Int,Int}}((i^5 + j^5) => (i, j) for i in 1:MAX, j in 1:MAX)
    
    sk = sort!(collect(keys(sum2)))
    for p in sort!(collect(keys(p5)))
        for s in sk
            p <= s && break
            if  p - s in keys(sum2)
                println(p5[p], sum2[s], sum2[p-s])   
            end
        end
    end
    return nothing
end
julia> main()
144(27, 84)(110, 133)
144(27, 110)(84, 133)
144(84, 110)(27, 133)
144(27, 133)(84, 110)
144(84, 133)(27, 110)
144(110, 133)(27, 84)
julia> @btime main()
⋮ (omitted output)
48.210 ms (362 allocations: 3.39 MiB)
1 Like

The last response contained all of the previous suggestions to which I added a test
to not print the duplicate results:

function main(MAX=250)
   p5 = Dict(i^5 => i for i in 1:MAX)
   sum2 = Dict{Int,Tuple{Int,Int}}((i^5 + j^5) => (i, j) for i in 1:MAX, j in 1:MAX)
   sk = sort!(collect(keys(sum2)))

   p5old=0
   for p in sort!(collect(keys(p5)))
        for s in sk
                p <= s && break
                if  haskey(sum2, p - s)
                        if p5[p] != p5old
                           println(p5[p], sum2[s],  sum2[p-s])
                           p5old=p5[p]
                        end
                end
        end
   end
   return nothing
end

@time  main()
@time  main()

~]$ julia euler4.jl
144(27, 84)(110, 133)
0.107246 seconds (96.74 k allocations: 8.776 MiB, 68.72% compilation time)
144(27, 84)(110, 133)
0.033693 seconds (75 allocations: 3.378 MiB)

For a comparison with python:

from time import perf_counter

tBeg = perf_counter ()

MAX = 250
p5, sum2 = {}, {}
 
for i in range(1, MAX):
    p5[i**5] = i
    for j in range(i, MAX):
        sum2[i**5 + j**5] = (i, j)
 
sk = sorted(sum2.keys())
def methodC ():
    p5old=0
    for p in sorted(p5.keys()):
        for s in sk:
            if p <= s: break
            if p - s in sum2:
                if p5[p] != p5old:
                   yield p5[p], sum2[s] + sum2[p-s]
                   p5old = p5[p]

for v1, v2 in methodC ():
    print (v1, v2)
tEnd = perf_counter ()
print ("D", "MAX", MAX, "In", tEnd - tBeg, "seconds")

~]$ python euler4.py
144 (27, 84, 110, 133)
D MAX 250 In 0.3231038339999941 seconds
Not sure if this Python is optimized, but it is slower by an order of magnitude.
I will have to send these result to one of my friends who is a Python expert to see if they can
squeeze more performance out of python.

Thanks to everyone who responded, The discussion was enlightening.
I have much more to learn about using Julia efficiently.

2 Likes