Why does this Python code performs three times faster than Julia?

stevengj · October 13, 2023, 6:22pm

There is a relevant section of the manual (though it talks about function arguments rather than local variables): Argument-type declarations.

In practice, explicit local-variable declarations are rarely used in Julia. Mainly to have finer-grained control over type-promotion/conversion rules (as an alternative to explicit type conversion at each assignment) or to control variable scope. Not for performance per se.

ufechner7 · October 13, 2023, 6:26pm

If you define a struct or an array, then it is worth to use concrete types. Otherwise I don’t do it (unless for defining functions that shall only work on a specific type). In other words: For a local variable, don’t do it.

lmiq · October 13, 2023, 11:59pm

Since you are playing with a MD code, and I may guess you speak Spanish, you can find some elemental implementations here, to play with:

Or other single - file examples here:

github.com

m3g/2021_FortranCon/blob/main/celllistmap_vs_namd/simulate.jl

import Chemfiles
using CellListMap
using FastPow
using StaticArrays
using Printf
using Base.Threads
using Parameters
using Statistics: mean
using LinearAlgebra: norm_sqr

#
# Simulation setup
#
@with_kw struct Params{V,N,T,M,UnitCellType}
    x0::V = getcoor("./ne10k_initial.pdb")  
    temperature::T = 300.
    nsteps::Int = 2_000
    dt::T = 1.0 # fs
    ibath::Int = 10
    print_energy::Int = 50

This file has been truncated. show original

ps: This is how I would/could write that code snippet:

@kwdef struct Params{T}
    σ::T = 1.0
    ϵ::T = 1.0
    m::T = 1.0
    Δt::T = 1.0
end

import FastPow: @fastpow 
force(r, ϵ, σ) = @fastpow -4*ϵ*(-12*σ^12/r^13 + 6*σ^6/r^7)

function md_model2(ʋᵢ::T, rᵢ::T, n; params = Params{T}()) where {T}
    (; σ, ϵ, m, Δt) = params
    rʋ = zeros(T, 2, n)
    for i in axes(rʋ, 2)
        ʋᵢ = ʋᵢ + Δt/2*(force(rᵢ, ϵ, σ)/m)
        rᵢ₊₁ = rᵢ + Δt*ʋᵢ
        ʋᵢ₊₁ = ʋᵢ + Δt/2*(force(rᵢ₊₁, ϵ, σ)/m)
        rʋ[1, i] = rᵢ = rᵢ₊₁
        rʋ[2, i] = ʋᵢ = ʋᵢ₊₁
    end
    return rʋ
end

With which you get:

# original
julia> @btime md_model($ʋ₀, $r₀, $n);
  1.864 s (50000004 allocations: 839.23 MiB)

# code above
julia> @btime md_model2($ʋ₀, $r₀, $n);
  137.313 ms (2 allocations: 76.29 MiB)

Notes:

Having the parameters in a struct gives you flexibility (to change them), without creating performance problems, because then it is practical to pass the parameters to the functions.
Julia is column-major, so better store the consecutive data (coordinates, for instance) in a (2,N) matrix than in a (N,2) matrix. For coordinates, you would normally use the StaticArrays package.
The @fastpow is important for performance in this case (gives you a ~5x speedup). But you can avoid it if you want by splitting the powers in products of small powers.

With these small powers and some additional small optimizations you can get:

julia> @btime md_model3($ʋ₀, $r₀, $n);
  112.524 ms (2 allocations: 76.29 MiB)

code

@kwdef struct Params{T}
    σ::T = 1.0
    ϵ::T = 1.0
    m::T = 1.0
    Δt::T = 1.0
end

# ϵ4 = 4*ϵ
# σ6 = σ^6
function force3(r, ϵ4, σ6)
    r6 = (r^3)^2
    r7 = r6 * r
    r13 = r6 * r7
    return -ϵ4*(-12*σ6^2/r13 + 6*σ6/r7)  
end

function md_model3(ʋᵢ::T, rᵢ::T, n; params = Params{T}()) where {T}
    (; σ, ϵ, m, Δt) = params
    ϵ4 = 4ϵ
    σ6 = σ^6
    Δt2 = Δt/2
    invm = 1/m
    rʋ = zeros(T, 2, n)
    for i in axes(rʋ, 2)
        ʋᵢ = ʋᵢ + Δt2*invm*(force3(rᵢ, ϵ4, σ6))
        rᵢ₊₁ = rᵢ + Δt*ʋᵢ
        ʋᵢ₊₁ = ʋᵢ + Δt2*invm*(force3(rᵢ₊₁, ϵ4, σ6))
        rᵢ = rᵢ₊₁
        ʋᵢ = ʋᵢ₊₁
        rʋ[1, i] = rᵢ₊₁
        rʋ[2, i] = ʋᵢ₊₁
    end
    return rʋ
end

Forgot to mention: with the above code you can also run with Float32 and get some additional speedup:

julia> @btime md_model3(1.f0, 1.f0, $n; params = Params{Float32}());
  92.385 ms (2 allocations: 38.15 MiB)

Topic		Replies	Views
Why is this code so slow in julia compared to a numpy implementation? Performance performance	9	3566	October 24, 2017
Why is python faster than Julia Performance	14	1915	March 12, 2020
Numpy 10x faster than Julia ?! What am I doing wrong ?! [solved - julia faster now] Performance question	37	10955	October 15, 2019
Why is this simple function twice as slow as its Python version Performance question	97	4388	April 12, 2021
Julia is slower than Python when appending elements to untyped arrays General Usage question	42	890	February 10, 2025

Why does this Python code performs three times faster than Julia?

Related topics