This is an old post but it seems it hasn’t been discussed yet.
An interesting observation the author found is that non-integer powers in Julia are fairly slow at that time, compared to C++. I am not sure if this is really the case. He mentioned removing those operations speeds up the Julia code by more than 10 times. This seems too much, but maybe at that time there were some problems in the non-integer powers.
Note that this post was in 2019 so kind of old, but I think Julia 1.0 had been released at that time.
Really interesting, indeed its a bit old and I think they wanted to keep things standard julia but with a little of packages you can beat the pythonic version by 9K x without changing much :
using ConcreteStructs,LinearAlgebra, StaticArrays, BenchmarkTools
function generate_particles(PType, n, lambda; l=1, Gamma=SA[0.0,0.0,0.0])
sigma = l/(n-1)*lambda
particles = fill(zero(PType), n^3)
xs = range(0, stop=l, length=n)
# Adds particles in a regular lattice
ind = 1
for k in 1:n
for j in 1:n
for i in 1:n
X = SA[xs[i], xs[j], xs[k]]
particles[ind] = PType(X, Gamma, sigma)
ind += 1
end
end
end
return particles
end
"""
This is a particle struct made up of ambiguous (unspecified) types
"""
@concrete struct ParticleAmbiguous
# User inputs
X # Position
Gamma # Vectorial circulation
sigma # Smoothing radius
# Properties
U # Velocity at particle
J # Jacobian at particle J[i,j]=dUi/dxj
end
ParticleAmbiguous(X, Gamma, sigma; U=zeros(3), J=zeros(3,3)) = ParticleAmbiguous(X, Gamma, sigma, U, J )
# Empty initializer
Base.zero(::Type{<:ParticleAmbiguous}) = ParticleAmbiguous(SA[0.0,0.0,0.0], SA[0.0,0.0,0.0], 0.0)
# Winckelmans regularizing kernel
g_wnk(r) = r^3 * (r^2 + 2.5) / (r^2 + 1)^2.5
dgdr_wnk(r) = 7.5 * r^2 / (r^2 + 1)^3.5
const const4 = 1/(4*pi)
"""
Pythonic programming approach
"""
@views @fastmath function P2P_pythonic(particles, g, dgdr)
for Pi in particles
@inbounds for Pj in particles
dX = Pi.X - Pj.X
r = norm(dX)
if r != 0
gsgm = g(r / Pj.sigma)
dgsgmdr = dgdr(r / Pj.sigma)
crss = cross(-const4 * (dX/r^3), Pj.Gamma)
Pi.U .+= gsgm * crss
@inbounds @simd for j in 1:3
Pi.J[:, j] += ( dX[j] / (Pj.sigma*r) * (dgsgmdr*crss) -
gsgm * (3*dX[j]/r^2) * crss -
gsgm * (const4/r^3) * cross(SVector{3}(i==j for i in 1:3), Pj.Gamma) )
end
end
end
end
end
particles_amb = generate_particles(ParticleAmbiguous, 6, 1.0)
@btime P2P_pythonic($particles_amb, $g_wnk, $dgdr_wnk) # 2ms