How to improve the implementation of a function involving numerical integration?


I am trying to optimize my code, but I don’t really know which direction to take (I do know I don’t want to use multithreading).

Here is my goal: I want to implement the following function

k_\perp(x,y; l) = k(x,y;l) - \frac{\int_{-1}^{1} p(w)k(x,w;l)\,dw \times \int_{-1}^{1} p(z)k(z,y;l)\,dz}{\int_{-1}^{1} \int_{-1}^{1} p(w)p(z)k(w,z;l)\,dw\,dz}

where p(x) = \frac{1}{\pi} \sqrt{1-x^2} and k(x,y;l) = \exp\left(-\frac{(x-y)^2}{2l^2}\right).
(This function will later be used to build a kernel, and some optimization will then be performed on l.)

Unfortunately, these integrals have no analytical solution (at least according to Mathematica).

Therefore, I wrote the little code below using QuadGK to compute the function, but I am not really happy with the performance, and I am pretty sure it could be faster. I also saw that ApproxFun could probably do something interesting with orthogonal polynomials, but I did not manage to write clean code for it…

using QuadGK, BenchmarkTools, Plots  # Plots provides contourf
k(x,y,l) = exp(-(x-y)^2/(2*l^2))
p(x) = sqrt(1-x^2)/π

ES(x,l) = quadgk(y -> k(x,y,l) * p(y), -1, 1)[1]
@benchmark ES(0.0, 1e1)

ES2(l) = quadgk(x -> p(x) * ES(x,l), -1, 1)[1]
@benchmark ES2(1e1)

k_orth(x,y,l) = @. k(x,y,l) - ES(x,l) * ES(y,l) / ES2(l)

x = range(-1.0, 1.0, length=100);
y = range(-1.0, 1.0, length=100);

@benchmark k_orth(x, y, 1e1)

contourf(x, y, (x,y) -> k_orth(x, y, 1e1))



One thing worth trying is using a package for multi-dimensional integration (e.g. HCubature.jl) rather than nesting 1d integrations. Doing so can be a lot more efficient, since you get better error control.
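For concreteness, here is one way that could look (a sketch, not from the thread; `ES2_2d` is a hypothetical name): the normalization constant is computed as a single 2d integral with `hcubature`, which takes a function of a vector and the lower/upper corners of the box.

```julia
using HCubature

# The kernel and weight from the question.
k(x, y, l) = exp(-(x - y)^2 / (2 * l^2))
p(x) = sqrt(1 - x^2) / π

# Normalization ∬ p(w) p(z) k(w,z;l) dw dz as one 2d integral
# instead of two nested 1d quadratures.
ES2_2d(l) = hcubature(u -> p(u[1]) * p(u[2]) * k(u[1], u[2], l),
                      (-1.0, -1.0), (1.0, 1.0))[1]
```

For l = 1e1 this should agree with the nested-`quadgk` value reported later in the thread (≈ 0.2493769…).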


Note that you are using the default tolerances here — by default, quadgk tries to get at least 8 significant digits. Do you need that kind of accuracy?

What is the typical value of the Gaussian width, l? If it is small (i.e. the function is sharply peaked) then you might want to do something more specialized.

Another basic issue here is that you have a square-root singularity at the edges of your domain, via the p(y) factors. This will slow down convergence substantially unless you employ a specialized quadrature scheme. (I’d eventually like you to be able to specify such endpoint singularities analytically to quadgk.) @lxvm has an experimental package with some support for this called QuadJGK.jl.

Even simpler, do a change of variables y=\cos\theta that eliminates the singularity:

\int_{-1}^{+1} f(y) p(y) dy = \frac{1}{\pi} \int_0^\pi f(\cos\theta) \sin(\theta)^2 d\theta \, ,

since p(\cos\theta) = \sin(\theta)/\pi for \theta \in [0,\pi].

For example, with x=0.1 and l = 10, this change of variables gives the same answer (actually, an even more accurate one) with roughly 14× fewer function evaluations (45 instead of 645):

julia> quadgk_count(y -> k(0.1,y,1e1) * p(y), -1, 1)
(0.49935087498090297, 6.368550962729288e-9, 645)

julia> quadgk_count(θ -> k(0.1,cos(θ),1e1) * sin(θ)^2/π, 0,pi)
(0.49935087461297634, 5.176886697100258e-12, 45)

If you loosen the tolerance to rtol=1e-4, you get another factor of 3 on this example (and the result is actually still accurate to machine precision, because the integrand is so smooth!):

julia> quadgk_count(θ -> k(0.1,cos(θ),1e1) * sin(θ)^2/π, 0,pi, rtol=1e-4)
(0.4993508746129765, 6.152897258360213e-8, 15)

You can also try pre-allocating buffers to speed things up, but I would focus first on giving QuadGK (or any integration routine!) “nicer” functions to integrate.
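As a sketch of the pre-allocation idea (`ES_buf` and `segbuf` are names I made up; the `alloc_segbuf`/`segbuf` keyword is QuadGK's documented mechanism): allocate the adaptive-subdivision buffer once and reuse it across calls, so repeated evaluations of ES don't allocate fresh storage each time. This also uses the cos θ change of variables from above.

```julia
using QuadGK

k(x, y, l) = exp(-(x - y)^2 / (2 * l^2))

# Pre-allocate quadgk's segment buffer once; reuse it on every call.
const segbuf = alloc_segbuf()

# ES via the singularity-free θ = acos(y) form, with a loose tolerance.
ES_buf(x, l) = quadgk(θ -> k(x, cos(θ), l) * sin(θ)^2 / π, 0, π;
                      rtol=1e-4, segbuf=segbuf)[1]
```

Note that reusing one global buffer is only safe if `ES_buf` is not called from multiple threads at once, which matches the "no multithreading" constraint here.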


Wouldn’t a change of variables to use erf be the way to go for ES?

The change of variables trick works about as well as the specialized quadrature for Jacobi weight functions with QuadJGK.jl. I get:

julia> using QuadJGK
[ Info: Precompiling QuadJGK [f27e79a0-bbaa-47b5-b2d3-8db3cef8b089]

julia> k(x,y,l) = exp(-(x-y)^2/(2*l^2))
k (generic function with 1 method)

julia> p(x) = sqrt(1-x^2)/π
p (generic function with 1 method)

julia> quadjgk_count(y -> k(0.1,y,1e1) * p(y), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)))
(0.49935087461297667, 1.1102230246251565e-16, 15)

For this example, machine precision is still possible with 7 integrand evaluations

julia> quadjgk_count(y -> k(0.1,y,1e1) * p(y), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)), order=3)
(0.4993508746129767, 1.6239870559431324e-10, 7)

And the same is true for the second integral

julia> ES(x,l) = quadjgk(y -> k(x,y,l) * p(y), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)))[1]
ES (generic function with 1 method)

julia> quadjgk_count(x -> p(x) * ES(x,1e1), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)), order=3)
(0.24937694744330427, 8.052006283953972e-11, 7)

So 7^2 = 49 function evaluations should be sufficient to compute the normalization for this large value of l=1e1. For l \ll 1, HCubature.jl is probably the better option for the 2d integral, since the integrand evaluations scale as \mathcal{O}(\log(l)) as l \to 0 for cubatures of a Gaussian localized at the origin, whereas for iterated integration it will be \mathcal{O}(\log^2(l)) (i.e. one factor for each integral).

Update: by loosening the error tolerance, it seems like only 5 integrand evaluations are needed for each 1d integral, and the result is still accurate to machine precision

julia> quadjgk_count(y -> k(0.1,y,1e1) * p(y), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)), order=2, rtol=1e-4)
(0.4993508746129768, 3.8987719608307714e-7, 5)

julia> quadjgk_count(x -> p(x) * ES(x,1e1), JacobiSpace(0.5, 0.5, OpenInterval(-1, 1)), order=2, rtol=1e-4)
(0.24937694744330427, 1.9377299406708737e-7, 5)

Edit: I wasn’t thinking: k(x,y;l) is localized on the line x=y, not at a point, so cubature scales like \mathcal{O}(\log(l)/l), which makes iterated integration more efficient as l \to 0. It would be hard to get rid of the logarithmic scaling without doing something special.


You could precompute ES and ES2 and then simply interpolate. ApproxFun is pretty good for that, but in reality any interpolation package can do the job.
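For instance (a sketch, not from the thread; `ES_interp` is a hypothetical name): for a fixed l, build an ApproxFun Chebyshev interpolant of x ↦ ES(x, l) once; afterwards each evaluation is just polynomial evaluation, with no quadrature at all.

```julia
using QuadGK, ApproxFun

k(x, y, l) = exp(-(x - y)^2 / (2 * l^2))
p(x) = sqrt(1 - x^2) / π

# ES via the singularity-free θ = acos(y) change of variables.
ES(x, l) = quadgk(θ -> k(x, cos(θ), l) * sin(θ)^2 / π, 0, π)[1]

# Precompute a Chebyshev interpolant of x ↦ ES(x, l) on [-1, 1].
ES_interp(l) = Fun(x -> ES(x, l), -1..1)
```

Usage: `es = ES_interp(1e1); es(0.1)`. The caveat is that the interpolant has to be rebuilt whenever l changes, so this pays off only if l changes less often than (x, y).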

You can also do a lot of sneaky things given that you have a Gaussian. You can identify two regimes:

First, you can chop off a lot of the domain when l is small enough. In ES(x,l), for x \in (-1,1), solve for w in \exp(-(x-w)^2/(2l^2)) < \epsilon_{tol}, then change your integration limits to (\max(-1, w_{\min}), \min(1, w_{\max})).
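Solving that inequality gives |x - w| > l \sqrt{2\log(1/\epsilon_{tol})}, which can be sketched like this (`ES_trunc` is a made-up name; the endpoint singularity at ±1 is still there whenever the truncated interval touches the boundary):

```julia
using QuadGK

k(x, y, l) = exp(-(x - y)^2 / (2 * l^2))
p(x) = sqrt(1 - x^2) / π

# Drop the part of [-1, 1] where the Gaussian factor is below tol:
# exp(-(x-w)^2/(2l^2)) < tol  ⟺  |x - w| > l*sqrt(2*log(1/tol)).
function ES_trunc(x, l; tol=1e-12)
    δ = l * sqrt(2 * log(1 / tol))
    a, b = max(-1.0, x - δ), min(1.0, x + δ)
    a >= b && return 0.0  # interval entirely chopped away
    quadgk(w -> k(x, w, l) * p(w), a, b)[1]
end
```

For large l the limits collapse back to (-1, 1) and nothing changes; the savings appear when l \ll 1 and the Gaussian is sharply peaked.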

When l is large you can use the Fourier transform, since ES is actually a convolution between a Gaussian (which we will denote K(x-y, l)) and p (where you extend p by 0 outside (-1,1)). Note that \int_{-1}^1 e^{-2\pi i \xi x} p(x)\,dx = \frac{J_1(2\pi \xi)}{2\pi \xi}. So ES(x,l) = \int_{-\infty}^\infty \frac{J_1(2\pi \xi)}{2\pi \xi} \hat{K}(\xi, l) e^{2\pi i \xi x}\,d\xi. The bandwidth of \hat{K} is small, so you can use the same trick of chopping the integral to the desired tolerance.
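A sketch of this in code (my own transcription of the formula above, with made-up names `phat`, `Khat`, `ES_fourier`; the Gaussian transform \hat{K}(\xi,l) = l\sqrt{2\pi}\,e^{-2\pi^2 l^2 \xi^2} is an assumption following the e^{-2\pi i \xi x} convention used here). Since both factors are real and even in ξ, the integral reduces to a cosine integral over [0, ∞):

```julia
using QuadGK, SpecialFunctions

# Fourier transforms of the semicircle weight p and the Gaussian K.
phat(ξ) = ξ == 0 ? 0.5 : besselj1(2π * ξ) / (2π * ξ)  # limit at ξ = 0 is 1/2
Khat(ξ, l) = l * sqrt(2π) * exp(-2 * π^2 * l^2 * ξ^2)

# ES(x, l) = ∫ p̂(ξ) K̂(ξ, l) e^{2πiξx} dξ; even integrand ⇒ real cosine form.
ES_fourier(x, l) = 2 * quadgk(ξ -> phat(ξ) * Khat(ξ, l) * cos(2π * ξ * x),
                              0, Inf)[1]
```

For large l the factor \hat{K} decays extremely fast, so very few evaluations near ξ = 0 are needed.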

For the second integral, if l is small, then since everything is concentrated on the diagonal, you are probably best off with higher-dimensional techniques, since you do not want to spend much time off the diagonal.

For the case of large l, you can apply the Fourier transform twice, so you are left with one integral instead of two. If I got everything right, ES2(l) = \int_{-\infty}^\infty \left(\frac{J_1(2 \pi \xi)}{2 \pi \xi} \right)^2 \hat{K}(\xi, l)\,d\xi.

Note: when doing the integration in Fourier space, after truncating, just use the trapezoidal rule; it will give you exponential convergence.
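Putting the last two points together, a large-l sketch might look like this (`ES2_fourier`, `phat`, `Khat` are made-up names, and \hat{K}(\xi,l) = l\sqrt{2\pi}\,e^{-2\pi^2 l^2 \xi^2} is my own transcription of the Gaussian transform): truncate where \hat{K} drops below the tolerance, then apply the plain trapezoidal rule, which converges very rapidly for this smooth, rapidly decaying integrand.

```julia
using SpecialFunctions

phat(ξ) = ξ == 0 ? 0.5 : besselj1(2π * ξ) / (2π * ξ)  # limit at ξ = 0 is 1/2
Khat(ξ, l) = l * sqrt(2π) * exp(-2 * π^2 * l^2 * ξ^2)

# ES2(l) = ∫ p̂(ξ)² K̂(ξ, l) dξ, truncated where the Gaussian factor ≈ tol,
# then evaluated with the trapezoidal rule (doubled for the ξ < 0 half).
function ES2_fourier(l; tol=1e-15, n=200)
    ξmax = sqrt(log(1 / tol)) / (sqrt(2) * π * l)  # exp(-2π²l²ξmax²) = tol
    ξs = range(0, ξmax, length=n)
    f = [phat(ξ)^2 * Khat(ξ, l) for ξ in ξs]
    h = step(ξs)
    2 * h * (sum(f) - (f[1] + f[end]) / 2)  # trapezoidal rule
end
```

For l = 1e1 this should reproduce the normalization value reported above (≈ 0.2493769…) from a couple hundred cheap Bessel/exp evaluations; for small l the truncation point moves out and this stops being the right regime.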