Hello!
I apologize in advance for using Unicode characters, but I really like them. I have a function given as:
using StaticArrays

function ∇ᵢWᵢⱼ(αD,q,xᵢⱼ,h)
    # Skip distances outside the support of the kernel:
    if q < 0.0 || q > 2.0
        return SVector(0.0,0.0,0.0)
    end
    gradWx = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[1] / (q*h+1e-6))
    gradWy = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[2] / (q*h+1e-6))
    gradWz = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[3] / (q*h+1e-6))
    return SVector(gradWx,gradWy,gradWz)
end
Benchmarking it with BenchmarkTools gives the following:
using BenchmarkTools

αD = 174
q = 1
h = 0.5659
xᵢⱼ = rand(SVector{3,Float64})
@benchmark ∇ᵢWᵢⱼ($αD,$q,$xᵢⱼ,$h)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 6.100 ns … 115.700 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 6.300 ns ┊ GC (median): 0.00%
Time (mean ± σ): 6.382 ns ± 1.985 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▅▁▁▁▁▁█▁▁▁▁▁▃▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁▁▂▁▁▁▁▂ ▂
6.1 ns Histogram: frequency by time 7.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
6.1 ns is already really fast, but any improvement is useful, since I have to call this function extremely often. I improved it a little by pulling everything that does not depend on the vector component into a single factor:
function Optim∇ᵢWᵢⱼ(αD,q,xᵢⱼ,h)
    # Skip distances outside the support of the kernel:
    if q < 0.0 || q > 2.0
        return SVector(0.0,0.0,0.0)
    end
    Fac = αD * 1/h * (5*(q-2)^3*q)/8 * (1/(q*h+1e-6))
    gradWx = Fac * xᵢⱼ[1]
    gradWy = Fac * xᵢⱼ[2]
    gradWz = Fac * xᵢⱼ[3]
    return SVector(gradWx,gradWy,gradWz)
end
This gives an improvement of about 8% in the minimum time (6.1 ns down to 5.6 ns):
@benchmark Optim∇ᵢWᵢⱼ($αD,$q,$xᵢⱼ,$h)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 5.600 ns … 149.000 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 5.700 ns ┊ GC (median): 0.00%
Time (mean ± σ): 5.967 ns ± 2.516 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▅ ▁
███▅▇▅▅▄▃▄▁▁▁▁▄▁▁▁▁▁▁▃▁▄▄▅▄▄▅▁▁▁▁▁▃▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▅▆ █
5.6 ns Histogram: log(frequency) by time 20.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
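For completeness, here is a minimal sketch of how the two versions can be checked against each other, so that the speed-up is not bought with a change in the result. Both functions are repeated so the snippet runs on its own. The set of test values for q is my own arbitrary choice (inside the support, on its boundaries, and outside it), and I compare with `≈` rather than `==` because the floating-point operations happen in a different order in the two versions.

```julia
using StaticArrays

# Original version, as posted above.
function ∇ᵢWᵢⱼ(αD, q, xᵢⱼ, h)
    # Skip distances outside the support of the kernel:
    if q < 0.0 || q > 2.0
        return SVector(0.0, 0.0, 0.0)
    end
    gradWx = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[1] / (q*h+1e-6))
    gradWy = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[2] / (q*h+1e-6))
    gradWz = αD * 1/h * (5*(q-2)^3*q)/8 * (xᵢⱼ[3] / (q*h+1e-6))
    return SVector(gradWx, gradWy, gradWz)
end

# Factored version, as posted above.
function Optim∇ᵢWᵢⱼ(αD, q, xᵢⱼ, h)
    if q < 0.0 || q > 2.0
        return SVector(0.0, 0.0, 0.0)
    end
    Fac = αD * 1/h * (5*(q-2)^3*q)/8 * (1/(q*h+1e-6))
    return SVector(Fac * xᵢⱼ[1], Fac * xᵢⱼ[2], Fac * xᵢⱼ[3])
end

αD  = 174
h   = 0.5659
xᵢⱼ = rand(SVector{3,Float64})

# Test values for q are an assumption: inside, on the edge of, and outside the support.
for q in (0.0, 0.5, 1.0, 2.0, 2.5)
    a = ∇ᵢWᵢⱼ(αD, q, xᵢⱼ, h)
    b = Optim∇ᵢWᵢⱼ(αD, q, xᵢⱼ, h)
    @assert a ≈ b   # approximate equality, since the operation order differs
end
```

If the loop finishes without an assertion error, the two versions agree to within floating-point tolerance for those inputs.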
I’ve seen some of you speed things up in ways that are (to me) incredible, so I felt it would be silly not to ask: does anyone know how to make this calculation faster?
Kind regards