Hello! New Julia user here.
Following this post, I try to learn the way to speed up sums / integrals. I am doing the following test:
using BenchmarkTools
xvec = range(0, 4, length=1000)
function fdot()
function _f1(x)
ans = 1/(1+x^2)
return ans
end
ans = sum(@. xvec * _f1(xvec))
return ans
end
function fcompre()
function _f2(x)
ans = 1/(1+x^2)
return ans
end
ans = sum(x * _f2(x) for x in xvec)
return ans
end
function floop()
function _f3(x)
ans = 1/(1+x^2)
return ans
end
ans = 0.0
for x in xvec
ans += x * _f3(x)
end
return ans
end
@btime $fdot()
@btime $fcompre()
@btime $floop()
First, you are breaking the very first rule in the performance tips: you are using a global variable xvec, rather than passing it as a parameter.
Second, you inner functions _f1(x) etcetera are referring to the variable ans, and Julia thinks that this is referring to the variable ans in the surrounding scope, which is not what you want (and is leading to unwanted allocations).
A better comparison would be:
function fdot(xvec)
_f1(x) = 1/(1+x^2)
return sum(@. xvec * _f1(xvec))
end
function fcompre(xvec)
_f2(x) = 1/(1+x^2)
return sum(x * _f2(x) for x in xvec)
end
function floop(xvec)
_f3(x) = 1/(1+x^2)
ans = 0.0
for x in xvec
ans += x * _f3(x)
end
return ans
end
Note that fdot is slower because it allocates a temporary vector (to store @. xvec * _f1(xvec)) before calling sum.
I try to learn the way to speed up sums / integrals.
Note also that if you are computing integrals, you can generally do much better by using a quadrature scheme based on unequally-spaced points, e.g. via the QuadGK.jl package.
Thank you for the explanation. May be what I try to understand is the huge difference in choosing one of the option to perform the sum:
function test2()
function fpick()
xvec = range(0.0, 4.0, length=1000)
function _f1(x)
ans = 1/(1+x^2)
return ans
end
# pick one
ans = sum(@. xvec * _f1(xvec))
#ans = sum(x * _f1(x) for x in xvec)
#=
ans = 0.0
for x in xvec
ans += x * _f1(x)
end
=#
return ans
end
fpick()
end
and your answer for the need to temporary store a vector for @. helps.
For integrals, I am using Gaussian Legendre from FastGaussQuadrature.jl, and is surprised to find
xvec, xweight = gausslegendre(N)
# pick one
ans = sum(@. xweight * _f(xvec) )
ans = sum( xw * _f(x) for (x, xw) in zip(xvec, xweight))
gives quite different performance in some (other) integrals I have. Thank you again!
You appear to have missed an important point of what @stevengj was telling you re scope. Note the improvement in speed.
using BenchmarkTools
function test2()
function fpick()
xvec = range(0.0, 4.0, length=1000)
function _f1(x)
local ans = 1/(1+x^2)
return ans
end
# pick one
local ans = sum(@. xvec * _f1(xvec))
#ans = sum(x * _f1(x) for x in xvec)
#=
ans = 0.0
for x in xvec
ans += x * _f1(x)
end
=#
return ans
end
fpick()
end
function test3()
function fpick()
xvec = range(0.0, 4.0, length=1000)
function _f1(x)
local ans = 1/(1+x^2)
return ans
end
# pick one
# local ans = sum(@. xvec * _f1(xvec))
local ans = sum(x * _f1(x) for x in xvec)
#=
ans = 0.0
for x in xvec
ans += x * _f1(x)
end
=#
return ans
end
fpick()
end
Thank you. Indeed I have not understood that point. Now I do! I thought (wrongly) that “ans” inside are unaffected by “ans” outside.
[what I think is still true: “ans” inside _f1, despite the “ans” outside, should be local since it is an assignment (without the global flag), and will not modify the “ans” outside. The functions will get the right answer, but using the same names seems to make it necessary to do a lot of pointless allocations, and that slow things down. ]
And indeed simply renaming / adding local for “ans” inside function speed up the all options. Thank you!