Numerically stable derivative of logerf without catastrophic cancellation

I need to evaluate the derivative of a logerf function fun with respect to x within the widest possible range of x. At the moment, the derivative is calculated through the function dfun.

using LogExpFunctions
using FiniteDifferences
using SpecialFunctions
using KahanSummation

function ksum(x...)
    return sum_kbn(x)

function fun(x,a,b)
    return logerf(a+x,b+x)

function dfun(x,a,b)

    ldeno = fun(x,a,b)
    v1 = -(b+x)^2 
    v2 = -(a+x)^2

    if (v1 < v2)
        t = ksum(-ldeno,v2,log(2/(sqrt(pi))),log1p(-exp(v1-v2)) )
        return -exp(t)
        t = ksum(-ldeno,v1,log(2/(sqrt(pi))),log1p(-exp(v2-v1)) )
        return exp(t)


x0 = -3*10.0^4
a = -3000.0
b = 3000.0

d1 = central_fdm(5,1)(t->fun(t,a,b),x0)
d2 = dfun(x0,a,b)


With the current values of a and b, the finite-difference approximation and the handcrafted derivative function agree well up to |x| < 10.0^4. For higher values of x, the derivatives begin to differ, until around x ā‰ˆ 10.0^9, the handcrafted derivative function starts to return a constant value of -1.1284 or so.
Meanwhile, the numerical precision of logerf implementation allows to evaluate fun up to 10^19 without double point precision under- or overflows.

Can the derivative of the presented logerf function be evaluated more precisely without using tricks like BigFloat? I already use logarithm transform trick in the evaluation.

The problem seems the first term v2 - ldeno or v1 - ldeno, which suffers a catastrophic cancellation for large arguments. The basic probem is that computing the two terms v2 (or v1) and ldeno separately loses everything, because the difference you want is in the least significant digits that have already been rounded off when v2 (or v1) and ldeno are computed ā€” there is no way to get these digits back when you subtract the two nearly equal quantities.

(Kahan summation is about reducing error accumulation in long sums, and is worthless at eliminating catastrophic cancellation from just 2 terms. I would omit it.)

The solution is do the cancellation analytically by digging into the logerf computationā€¦

1 Like

In particular, with this kind of thing, the ā€œscaledā€ special functions are often your friends ā€” here, \text{erfcx}(x) = e^{x^2} \text{erfc}(x), which gives various identities

\log \text{erf}(u,v) = \log \left[ e^{-u^2} \text{erfcx}(u) - e^{-v^2} \text{erfcx}(v) \right] \\ = -u^2 + \log \left[ \text{erfcx}(u) - e^{u^2-v^2} \text{erfcx}(v) \right] \\ = -u^2 + \log \left[ e^{u^2-v^2} \text{erfcx}(-v) -\text{erfcx}(-u) \right] \\ = -v^2 + \log \left[ e^{v^2-u^2} \text{erfcx}(u) - \text{erfcx}(v) \right] \\ = -v^2 + \log \left[ \text{erfcx}(-v) - e^{v^2-u^2} \text{erfcx}(-u) \right] \\

By selecting the appropriate version of these identities, you can cancel the -u^2 or -v^2 term analytically in your sum.

I whipped something together based on this, and it seems mostly right at first glance, but there may be some corner cases I havenā€™t handled correctly. Hopefully it will be a useful starting point for you, at least:

function dfun2(x,a,b)
    u, v  = a+x, b+x
    uĀ²,vĀ² = u^2, v^2
    c = 2 / sqrt(oftype(u+v, pi))
    if uĀ² < vĀ²
        if v < 0
            t = log1p(-exp(uĀ²-vĀ²)) - log(exp(uĀ²-vĀ²)*erfcx(-v) - erfcx(-u))
            t = log1p(-exp(uĀ²-vĀ²)) - log(erfcx(u) - exp(uĀ²-vĀ²)*erfcx(v))
        return -c * exp(t)
        if u < 0
            t = log1p(-exp(vĀ²-uĀ²)) - log(erfcx(-v) - exp(vĀ²-uĀ²)*erfcx(-u))
            t = log1p(-exp(vĀ²-uĀ²)) - log(exp(vĀ²-uĀ²)*erfcx(u) - erfcx(v))
        return c * exp(t)

it would be cleaner to just swap u^2 and v^2 rather than duplicating logic.


That seems to do the trick, many thanks!