Need help avoiding catastrophic cancellation or overflow

I am computing integrals of the form \int_s^t e^{-a -bx-cx^2}dx. The solution is available in closed form in a number of cases such as (1) c>0 and (2) c==0 and b!=0. The problem is that the closed form solution for the case with c>0 gets some 0/0 errors as c approaches 0. I have switched to using numerical quadrature which is insensitive to the issue, but I would appreciate if anybody has suggestions for a better way to code up the solutions.

Specifically, the solution for case 1 is \frac{\sqrt{\pi}}{2\sqrt{c}} e^{b^2/4c - a} (erf(\frac{b+2ct}{2\sqrt{c}}) - erf(\frac{b+2cs}{2\sqrt{c}})). The issue is that \frac{\sqrt{\pi}}{2\sqrt{c}} e^{b^2/4c} obviously goes haywire as c goes to 0 (the erf terms go to zero). As a result, c doesn’t have to get very small before I start getting NaN. Notably, if I switch to big(c) floats, the issues go away. However, I would rather not go down that path. I’m hoping someone can provide a new way to compute this that sidesteps the current deficiency.

The solution for case 2 is e^{-a}(e^{-bs}-e^{-bt})/b but does not really suffer from the same problem.

using LinearAlgebra
using SpecialFunctions
using QuadGK

# helper fun for case 1
erftransform(val,b,c) = (b+2*c*val)/(2*sqrt(c))

# case 1 solution
_integratesegment(mL,mU,a,b,c) = sqrt(pi)/(2*sqrt(c)) * exp(b^2/(4*c) - a) * erf(erftransform(mL,b,c),erftransform(mU,b,c))

# case 2 solution
_integratesegment(mL,mU,a,b) = exp(-a)*(exp(-b*mL)-exp(-b*mU))/b

# convenience function to handle branches
function integratesegment(mL,mU,a,b,c)
    if c <= 1e-16 # sloppy branching...
        iszero(b) && return exp(-a)*(mU-mL)
        return _integratesegment(mL,mU,a,b)
    else
        return _integratesegment(mL,mU,a,b,c)
    end
end

# numerical integration for checking
function integratesegmentgk(mL,mU,a,b,c)
	hfun(x) = exp(-a -b*x -c*x^2)
	res,tol = quadgk(hfun,mL,mU)
	return res
end

cvals = [0, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1]

integratesegment.(0,1,1,-1,cvals)
#7-element Vector{Float64}:
#  0.6321205588285577
#NaN
#NaN
# 0.6318564031364802
# 0.629486671230845
# 0.6065301940547417
# 0.43578743768823325

integratesegmentgk.(0,1,1,-1,cvals)
#7-element Vector{Float64}:
# 0.6321205588285577
# 0.6321179164259259
# 0.632094135571238
# 0.6318564031364771
# 0.62948667123084
# 0.6065301940547421
# 0.43578743768823336

integratesegment.(0,1,1,-1,big.(cvals))
#7-element Vector{AbstractFloat}:
# 0.6321205588285577
# 0.6321179164259257048642256113765913352730335321928808376341466963000811949775715
# 0.6320941355712378411437988074357461837728588847560930563234431983546833766473449
# 0.6318564031364769326148887213013742689733182220055752005550532909361636641234183
# 0.6294866712308398372144996839722568272635753665524012813369407729846853283018586
# 0.6065301940547420107510309245899038352682705934158216165434799660850915880258804
# 0.4357874376882332994275778807401227237933115845358139306442570140687508954685834

I guess that the case 1 must have \sqrt \pi instead of \pi.

One possible route to try is to expand e^{-cx^2}=1-cx^2+O(c^2)

and then do the integral:

e^{-a}\int_s^t e^{-bx}(1-cx^2+...)dx=e^{-a}(\frac{e^{-bs}-e^{-bt}}{b}+c \frac {e^{-b t} (-2 - b t (2 + b t))-e^{-b s} (-2 - b s (2 + b s))}{b^3})+O(c^2)

With more terms it will be more accurate but at least it recover the expected result for c=0 and it seems that you will have good results for b^2 > c (otherwise you have the closed solution 1)

Good luck…

1 Like

Yes sorry I see now that the latex had a typo! I will edit it.

Also, I will see about using a Taylor expansion, which is not something I had really thought about.

I was thinking not along the lines of an approximation but simply a different way to package the computations so that the bad behaviors went away.

Just use the \mathrm{erfcx}(x) = e^{x^2} \mathrm{erfc}(x) function (included in SpecialFunctions.jl). In particular, your case-1 expression becomes:

\frac{\sqrt{\pi}}{2\sqrt{c}} e^{b^2/4c - a} \left[e^{-(b+2cs)^2/4c}\mathrm{erfcx}\left(\frac{b+2cs}{2\sqrt{c}}\right) - e^{-(b+2ct)^2/4c}\mathrm{erfcx}\left(\frac{b+2ct}{2\sqrt{c}}\right)\right] \\ = \frac{\sqrt{\pi}}{2\sqrt{c}} e^{-a} \left[e^{-bs-cs^2}\mathrm{erfcx}\left(\frac{b+2cs}{2\sqrt{c}}\right) - e^{-bt-ct^2}\mathrm{erfcx}\left(\frac{b+2ct}{2\sqrt{c}}\right)\right]

which is much better behaved for small c>0 (no problems with overflow). You’ll still want to use the case-2 formula exactly at c = 0, but otherwise you should be fine.

9 Likes

Thanks for taking a look Steven. I verified your formula, but it seems not to really behave any better. It does behave better when b is positive, but in my case b will be positive only about half of the time. When b is negative like the MWE in the original post, your rewrite gives NaN for small c and gives nonsense results when big(c) is used.

using SpecialFunctions
erftransform(val,b,c) = (b+2*c*val)/(2*sqrt(c))
erftransformsgj(val,b,c) = exp(-b*val - c*val^2)*erfcx(erftransform(val,b,c))

function integratesegmentsgj(mL,mU,a,b,c)
	if c <= 0
		iszero(b) && return exp(-a)*(mU-mL)
		return exp(-a)*(exp(-b*mL)-exp(-b*mU))/b
    else
        return sqrt(pi)/(2*sqrt(c)) * exp(-a)* (erftransformsgj(mL,b,c) - erftransformsgj(mU,b,c))
    end
end

cvals = [0, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1]

integratesegmentsgj.(0,1,1,-1,cvals)
#7-element Vector{Float64}:
#   0.6321205588285577
# NaN
# NaN
#  4.0215245211834096e96
#   0.6320906688380318
#   0.6065301940547216
#   0.43578743768823336

integratesegmentsgj.(0,1,1,-1,big.(cvals))
#7-element Vector{AbstractFloat}:
#  0.6321205588285577
#1.343104215291380647530388480807286209268696364530121478518807903695589563920231e+10787
#3.679560299841559630427548397181294030032348329205132231280946683155788742871513e+1010
#-1.706318763792106916171342435659577622778170598603821657808120869231797415983239e+35
#0.6294866712308398584816157590603934840746038725806679321675664317356159846250168
#0.6065301940547420312425657410239570180704723224570818873520949455814748757171925
#0.4357874376882333141505933867860602488507980327823414407255768784178921779538312

Without looking in detail at what you are doing my favorite resource is Abramovitz and Stegun “Handbook of Mathematical Functions” a Dover publication. All sorts of interesting identities and approximations in there.

1 Like

I think in that case you need to switch the sign around, so that instead of substituting in

\mathrm{erf}(x) = 1 - \mathrm{erfc}(x) = 1-e^{-x^2}\mathrm{erfcx}(x)

you instead use

\mathrm{erf}(x) = -\mathrm{erf}(-x) = \mathrm{erfc}(-x)-1 = e^{-x^2}\mathrm{erfcx}(-x) - 1
2 Likes

Also note the expected behavior of erfc(x) when x is large. erfc(27.0) ≈ 5.237e-319 which exceeds floatmax(Float64).

1 Like

Yea something like that must be happening. For example, in the rewritten version the leading term isn’t so crazy for the values of c I’ve put in the MWE.

julia> sqrt(pi)*exp(-1)*0.5./sqrt.(cvals)
7-element Vector{Float64}:
  Inf
 103.09805182296564
  32.602466608664606
  10.309805182296564
   3.2602466608664606
   1.0309805182296563
   0.3260246660866461

The final result for the smaller values of c is about 0.63, so everything else in the expression should actually evaluate to not such an extreme number… 0.63/103 = 0.006 for the first nonzero c case.

The problem is that erfcx blows up exponentially for negative arguments, so if you have negative b and small c then it will quickly give Inf, and then you will get NaN from Inf - Inf.

However, there is a simple solution: for b < 0, do a change of variables x \to -x in your integral, which corresponds to flipping the signs of b, s, t and swapping the latter. That is:

function f(a,b,c,s,t)
    b < 0 && return f(a,-b,c,-t,-s)
    c == 0 && return exp(-a) * (exp(-b*s) - exp(-b*t)) / b
    return sqrt(pi/4c) * exp(-a) * (exp(-b*s-c*s^2) * erfcx((b+2c*s)/2√c) -
                                    exp(-b*t-c*t^2) * erfcx((b+2c*t)/2√c))
end

Then small c is not a problem regardless of the sign of b:

julia> f(1,2,1e-100,3,4)
0.00039423608073391826

julia> f(1,-2,1e-100,3,4)
474.1099996629409

(Note also that you only need to separate the c == 0 case, not some tricky-to-get-right precision-dependent threshold like c < 1e-16.)

4 Likes

There a few other corner cases, for example if you want to handle b == c == 0 then you need to return exp(-a) * (t - s) in that case to avoid getting a NaN from 0/0.

(Writing a special function starts by getting the gross details right, but then turns into an exercise in tracking down all the limiting cases.)

3 Likes

This is great! I guess this is the part that @simonbyrne was hinting at but I didn’t fully comprehend that I also needed to adjust the limits of integration as well.

Also the c < 1e-16 branching was simply a holdover from a time before I knew why the function was breaking down.

I will handle the corner cases.

PS. @Dictino I did experiment with a Taylor expansion about the point (s+t)/2 (up to order 5). It works great when c is small, but when c gets larger it is less good. That approach would have required some heuristic for switching between the exact solution and the approximation.

1 Like

In particular, I would use something like

c == 0 && return b == 0 ? exp(-a) * (t - s) : exp(-a) * (expm1(-b*s) - expm1(-b*t)) / b

where the expm1 here prevents catastrophic cancellation for small b.

Another tricky case is the one where both b and c are very small but nonzero.

1 Like

Yes, the ultimate fallback method for special functions is to stitch together approximations (Taylor series, asymptotic series, continued-fraction expansions…) in different regions of the domain. It requires a lot of fiddling to determine the degree of the approximation and the transition points between approximations to achieve a given accuracy. For a relatively simple example, see e.g. more accurate and type-stable sinc/cosc by stevengj · Pull Request #37273 · JuliaLang/julia · GitHub

1 Like

A similar example of this type is given in the HypergeometricFunctions.jl package for conditionally evaluating the Gauss hypergeometric function: