I am computing 2 integrals over the real line using quadgk
from the package of the same name. The goal is to estimate the ratio of the integrals. Often both integrals are quite small, especially in some extreme cases I’m using to test. Getting reasonable values for the numerator has been tricky, and I’m hoping for some advice on how to handle it.
The denominator is the integral of a positive function, and seems to work well with only a relative tolerance. But the numerator might be 0, and so using an absolute tolerance seems advisable. I have tried scaling it by the size of the denominator, but sometimes the result is much smaller than it should be. Removing the absolute tolerance fixes things, but that re-opens the possible bad behavior for true 0.
Sample code:
# illustrates problems with premature termination of integration
using QuadGK
"conditional probability of n successes out of n tries"
function conddens(n::Int, z::Float64)
η = -2.0+0.25*z # transform z to mean -2, sd 0.25
return (1.0+exp(-η))^(-n)
end
"combined normal density and weight function, ignoring constants"
function dens2(z::Float64)
λ = 0.48
return exp((2*λ-1)*z^2/2.0)
end
function trouble(n::Int)
f0(z) = conddens(n, z)*dens2(z)
r0, r0err = quadgk(f0, -Inf, Inf, atol=0.0)
f1(z) = z*conddens(n, z)*dens2(z) #same as z*f0(z)
tol1 = r0*sqrt(eps())
r1, r1err = quadgk(f1, -Inf, Inf, atol=tol1)
r1b, r1berr = quadgk(f1, -Inf, Inf, atol=0.0)
println("Denominator = $(r0) ± $(r0err).")
println("Numerator = $(r1) ± $(r1err) with atol = $(tol1).")
println("Numerator = $(r1b) ± $(r1berr) with no atol.")
end
trouble(100)
Produces
Denominator = 1.7081181337842018e-5 ± 9.259586775755586e-14.
Numerator = 7.122927197595122e-22 ± 7.494964918478972e-22 with atol = 2.5452943649652627e-13.
Numerator = 0.0003717149762667195 ± 3.760194450016882e-12 with no atol.
In these problem cases the mode of the function being integrated can be relatively far out, like at 10 to 15. I presume in the failure the region with the mode is not discovered.
Another way to fix the problem, at least in this case, is to specify order=30 to the integral with the absolute tolerance. The documentation says “It is advisable to increase the integration order in rough proportion to the precision, for smooth integrands.” I can’t turn that into an operational rule, beyond “increase the order if you increase precision.” I’d guess “precision” refers to the absolute value of the power of 10 for the desired accuracy, and that the proportion is relative to the default of 7 for the order and a tolerance of … sqrt(machine epsilon)?
All of these approaches seem very brute force compared to starting with a line search for the mode. I understand quadgk
can’t assume anything about the function it is integrating, but in my case they seem to be fairly smooth, unimodal, functions that would lend themselves well to a line search. It is conceivable there might be 2 humps, however.
The presence of most of the mass far out makes Gauss-Hermite quadrature unappealing despite the presence of a normal density in the function being evaluated.