"Very optimised" Julia code 50x slower than Python with compiled functions

Hi All,
I am doing some Monte Carlo computation using Python (compiling some functions to C++ using the Numba package).
I thought I would see how Julia compares.
As is good programming practice, I scraped some code from the Internet Ising 2D, simulating the same 2D square Ising model I am into. The code is supposedly “very optimised” already.
The fact is, the Python code is almost 50 times faster, two seconds vs approx one hundred. The Python code it is not optimised at all, and it does perform additional operations such as appending a values to string for every iteration. They both perform a Metropolis-Hasting iteration for one million iterations, over a 100x100 lattice.
I am totally at a loss. I fear I am missing something silly, but I have been checking all morning.
This is the Julia code

using BenchmarkTools, StaticArrays
using Random: default_rng, seed!

function ising2d_ifelse!(s, β, niters, rng=default_rng())
    m, n = size(s)
    min_h = -4
    max_h = 4
    prob = [1/(1+exp(-2*β*h)) for h in min_h:max_h]
    for iter in 1:niters
        for j in 1:n 
            for i in 1:m
                NN = s[ifelse(i == 1, m, i-1), j]
                SS = s[ifelse(i == m, 1, i+1), j]
                WW = s[i, ifelse(j == 1, n, j-1)]
                EE = s[i, ifelse(j == n, 1, j+1)]
                h = NN + SS + WW + EE
                s[i,j] = ifelse(rand(rng) < prob[h-min_h+1], +1, -1)
            end
        end
    end
end

const β_crit = log(1+sqrt(2))/2
#rand_ising2d(m, n=m) = rand(Int8[-1, 1], m, n)
rand_ising2d(m, n=m) = fill(1,(m,m))

seed!(4649)
s₀ = rand_ising2d(100);

s = copy(s₀)
rng = default_rng()
ising2d_ifelse!(s, β_crit, 10^3, rng)

s = copy(s₀)
steps = 1000000
seed!(4649)
rng = default_rng()
#@benchmark ising2d_ifelse!(s, β_crit, 10^6, rng)
#@time begin
#  ising2d_ifelse!(s, β_crit, 1000000, rng)
#end

time_ising(s, β_crit, steps, rng) = @time  ising2d_ifelse!(s, β_crit, steps, rng)
time_ising(s, β_crit, steps, rng)

and its (compiled) Python counterpart

import numpy as np
from math import exp
from random import randrange, random, choice
from numba import jit

from datetime import datetime
startTime = datetime.now()


@jit(nopython= True)
def init_lowT(N): 
        return  np.ones((N,N))
@jit(nopython= True)        
def init_highT(N):
        l = np.zeros((self.N,self.N),dtype=int)
        for y in range(self.N):
            for x in range(self.N):
                l[x,y] = choice([1,-1])
        return l
    

@jit(nopython= True)
def E_elem(array,x,y, B): 
    N = array.shape[0]
    if B==0: #no magnetic field
        return (-1.0 * array[x,y]*(array[(x + 1) % N, y] +
                                            array[(x - 1 + N) % N, y] +
                                            array[x, (y + 1) % N] +
                                            array[x, (y - 1 + N) % N]))
    else: #there is a magnetic field
        return (-1.0 * array[x,y]*(array[(x + 1) % N, y] +
                                            array[(x - 1 + N) % N, y] +
                                            array[x, (y + 1) % N] +
                                            array[x, (y - 1 + N) % N]) -
                                            array[x,y]*B)


# sums up potential of lattice
@jit(nopython= True)
def E_tot(array, B):
   N = array.shape[0]
   en = 0
   for y in range(N):
       for x in range(N):
           en += E_elem(array, x,y,B)
   return en

# sums up magnetization of lattice
@jit(nopython= True)
def M_tot(array):
   N = array.shape[0]
   mag = 0
   for y in range(N):
       for x in range(N):
           mag += array[x,y]
   return mag


@jit(nopython= True)
def metropolis(array, T, steps, B):
    N = array.shape[0]
    Elist = []
    Mlist = []
    M = M_tot(array)
    E = E_tot(array, B)
    for i in range(steps):          
    # choose random atom
        x = randrange(N)
        y = randrange(N)

        dE = -2* E_elem(array,x,y,B)

        if (dE <= 0):
            array [x,y] *= -1
            M += 2 * array[x,y]
            E += dE
            Mlist.append(M)
            Elist.append(E)

        elif random() < exp(-1.0 * dE/T):
            array[x,y] *= -1
            M += 2 * array[x,y]
            E += dE
            Mlist.append(M)
            Elist.append(E)
#        Mlist = Mlist[::100]
#       Elist = Elist[::100]
    return array, Mlist, Elist


@jit(nopython= True)
def run_sim(N=100,T=1.3, steps = 500000, B=-1):
    start_conf = init_lowT(N)
    res, Mlist, Elist = metropolis(start_conf, T, steps, B)
#    M.metropolis(steps = steps)
# plt.imshow(res,cmap="bwr")
#    plt.colorbar()
    return  res, Mlist, Elist

T_hat = 1.0*2.26

res, Mlist, Elist = run_sim(N=128,T=T_hat, steps = 1000000, B=0)

print(datetime.now() - startTime)

The timing might be quite rudimental in both cases, but the difference is so astonishing (approx. 2 sec vs 100 sec) I do not think it matters. Any hint or ideas please?

Are not the two codes completely different?

In Julia you have a for j in 1:n and for i in 1:m inside for iter in 1:niters. In Python you select a single index in the range 1:N to do something over it. The Julia algorithm is cubic, the Python algorithm is linear.

4 Likes

The python code picks a random lattice site (metropolis algorithm) at every step.

 x = randrange(N)
 y = randrange(N)

whereas the julia code uses all lattice sites at every step

for iter in 1:niters
        for j in 1:n 
            for i in 1:m

This is not correct and at the at the resolution that you tested it is also 10000x more work.

function ising2d_ifelse!(s, β, niters, rng=default_rng())
    m, n = size(s)
    min_h = -4
    max_h = 4
    prob = [1/(1+exp(-2*β*h)) for h in min_h:max_h]

    for iter in 1:niters
        j = rand(rng, 1:n)
        i = rand(rng, 1:m)
        NN = s[ifelse(i == 1, m, i-1), j]
        SS = s[ifelse(i == m, 1, i+1), j]
        WW = s[i, ifelse(j == 1, n, j-1)]
        EE = s[i, ifelse(j == n, 1, j+1)]
        h = NN + SS + WW + EE
        s[i,j] = ifelse(rand(rng) < prob[h-min_h+1], +1, -1)
    end
end

this takes 0.027s on my computer

5 Likes

Goodness gracious, I totally missed that part. I gave it for granted it would do the standard Metropolis flip, indeed it goes through all sequentially, at every iterations.
My most sincere apologies. At least, it gets confirmed once more Julia crashed Python + Numba. Thanks again

5 Likes