The following code contains two functions that compute Lennard-Jones potentials, U(r) = 4\varepsilon \left(\frac{\sigma^{12}}{r^{12}} - \frac{\sigma^6}{r^6}\right) for all pairs of a set of 1000 particles.
The first function is the “obvious” one, which directly translates the formula above. In the second one I manually optimize the computation of the powers of \sigma and r .
The difference is way too large:
77.724 ms (0 allocations: 0 bytes)
1.802 ms (0 allocations: 0 bytes)
Code
using StaticArrays, LinearAlgebra, BenchmarkTools, Test
function lj1(x)
eps = 2.0
sig = 1.0
u = 0.
for i in 1:length(x)-1
for j in i+1:length(x)
r = norm(x[j] - x[i])
u += 4*eps*(sig^12/r^12 - sig^6/r^6)
end
end
return u
end
function lj2(x)
eps = 2.0
sig = 1.0
sig6 = sig^6
sig12 = sig6^2
u = 0.
for i in 1:length(x)-1
for j in i+1:length(x)
r = norm(x[j] - x[i])
r3 = r^2*r
r6 = r3^2
r12 = r6^2
u += 4*eps*(sig12/r12 - sig6/r6)
end
end
return u
end
x = rand(SVector{2,Float64},1000)
@test lj1(x) ≈ lj2(x)
@btime lj1($x)
@btime lj2($x)
I would not be very concerned about this if this was actually a very performance-critical code, manually optimizing those exponents is not a big drama.
However, I am writing this code for a demonstration, hands-on, course on molecular dynamics, so having a code as close as possible as the text-formulas is important. In this case, in particular, the difference is the simulation taking 1 minute vs. taking 20 minutes, thus completely changing the flow of the course.
As a side-note, gfortran does seem to optimize these exponentiations, and the code is fast in the “natural” version.
Thus, my questions are: is there any important/fundamental that is impairing Julia to optimize these exponents, is it just an implementation detail that is on the list, or what?