References and implementation for safe/hybrid univariate Newton's method

There exist variants of Newton’s method with safeguards against non-convergence (e.g. the iterate escaping the bracket, or not converging “fast enough”), in which case they switch to a bracketing method like Brent’s or bisection. The various implementations I found differ in their details, and I could not find a paper with a convergence analysis.

  1. Is there a reference that discusses these methods in detail?

  2. Is there a Julia implementation?

For a while, any time I needed to solve a new univariate problem I would try different univariate solvers, and I always found that Brent’s method (from Optim.jl) performed best. So now I just always reach for that.
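For reference, the call looks along these lines (a minimal sketch; the objective function here is arbitrary, just for illustration):

```julia
using Optim

# univariate Brent minimization over the interval [0, 5];
# the objective is made up for the example
res = optimize(x -> (x - 2)^2 + sin(3x), 0.0, 5.0, Brent())
Optim.minimizer(res)
```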

I do remember recently seeing a post here about someone implementing the ITP method in a new package, but I have not given it a try. That method would be more similar to what you described.

No, the ITP method is not really similar, as it does not use derivatives, and it is already implemented in Roots.jl (thanks @TheLateKronos!)
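Usage follows the usual bracketing interface of find_zero, something along these lines (the test function is arbitrary):

```julia
using Roots

# ITP is a bracketing method, so it takes an interval rather than
# a derivative; exp(x) - x^4 has a root near 8.6131
find_zero(x -> exp(x) - x^4, (8, 9), Roots.ITP())
```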

Just to clarify: I am not asking for recommendations on rootfinding methods.

The question is about a very specific family of methods, and is mainly motivated by curiosity. Two general recommendations I encounter are

  1. get close to a root, then use Newton-Raphson for “polishing”,
  2. try Newton-Raphson, fall back to some bracketing method if it “fails”

But in practice I find it hard to implement these in a robust and practical way, because the criteria for switching are quite fuzzy. This is a shame, since for some functions derivatives are readily available, so a bracketing method that uses derivatives would be useful.
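To make the fuzziness concrete, the switching tests for recommendation 2 tend to look like this (a sketch; the names and the 0.5 threshold are invented for illustration):

```julia
# two typical ad hoc "Newton failed" tests; both the 0.5 factor and
# the function names are made up, there is no canonical choice
escaped_bracket(x, a, b) = !(a < x < b)
slow_progress(step, a, b; factor = 0.5) = abs(step) > factor * (b - a)
```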

The Numerical Recipes book, chapter 9, covers this topic in detail.

If you are referring to rtsafe in Section 9.4 of the 3rd edition, then it doesn’t. It is one of the ad hoc implementations floating around. All you get is the description

The hybrid algorithm takes a bisection step whenever Newton-Raphson would take the solution out of bounds, or whenever Newton-Raphson is not reducing the size of the brackets rapidly enough.

but a convergence analysis or a justification for the criteria is missing.

Again, it is very easy (and tempting) to make ad hoc modifications to existing algorithms. Unfortunately, these can easily backfire.

If we are talking about a continuous function and a bracket which is always at least cut in half, then the intermediate value theorem guarantees that the method converges. Taking the Newton step when it makes the interval smaller than half its current size, and the bisection step otherwise, can only make convergence faster, since the interval is at most half its previous size after every step.
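One way to realize this guarantee is to always evaluate at the midpoint, and let a Newton step shrink the bracket further when it can. A minimal sketch (the name, tolerances, and structure are invented for illustration; this is not the Numerical Recipes routine):

```julia
# hybrid bisection/Newton with a guaranteed halving of the bracket;
# `fp` is the derivative of `f`, and `a < b` must bracket a root
function bisect_newton(f, fp, a, b; xtol = 1e-10, maxiter = 100)
    fa, fb = f(a), f(b)
    fa * fb < 0 || error("f(a) and f(b) must have opposite signs")
    for _ in 1:maxiter
        # bisection part: evaluate at the midpoint and keep the half
        # with the sign change, so the bracket is at least halved
        m = (a + b) / 2
        fm = f(m)
        iszero(fm) && return m
        if fa * fm < 0
            b = m
        else
            a, fa = m, fm
        end
        # Newton part: a step from the midpoint is accepted only when
        # it lands inside the already-halved bracket (a non-finite or
        # out-of-bracket iterate fails the `a < x < b` test, so it is
        # simply skipped)
        x = m - fm / fp(m)
        if a < x < b
            fx = f(x)
            iszero(fx) && return x
            if fa * fx < 0
                b = x
            else
                a, fa = x, fx
            end
        end
        b - a < xtol && break
    end
    return (a + b) / 2
end
```

E.g. bisect_newton(x -> x^2 - 2, x -> 2x, 0.0, 2.0) returns √2 up to the tolerance, and the midpoint evaluation alone already gives the bisection convergence rate in the worst case.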

Not sure what else you are looking for.

I am not sure how that follows. In my copy of the book the step is taken if it does not go out of the bracket and

|2f(x_m)| < |x_2 - x_1||f'(x_m)|

where (x_1, x_2) is the original bracket, and x_m = (x_1 + x_2)/2.

The new value is x_m - f(x_m)/f'(x_m), which could be close to x_1 or x_2.

To make it a bit more concrete, consider x_1 = 0, x_2 = 1, f(0) = -1, f(1) = 2, f(0.5) = 1, f'(0.5) = a. The second condition for accepting the Newton step is

2 < |a|

while the resulting point is 0.5 - 2/a. Now let a = 4.1, so that x_3 \approx 0.01. If f(x_3) < 0, (x_3, x_2) will be the new bracket, which is about 99% of the old one.

Following the notation in the 3rd edition of the book, if I am not mistaken, the next point computed seems to be 0.378:

```julia
x1, x2 = 0.0, 1.0       # the bracket
fl, fh = -1.0, 2.0      # f at the endpoints
xl, xh = x1, x2
rts = 0.5 * (x1 + x2)   # midpoint = 0.5
dxold = abs(x2 - x1)
dx = dxold
f = 0.5                 # f(rts)
df = 4.1                # f'(rts)

# rtsafe's rejection test: bisect if the Newton step would leave the
# bracket or would not shrink fast enough; here it evaluates to false
(((rts - xh) * df - f) * ((rts - xl) * df - f) > 0.0) ||
    (abs(2.0 * f) > abs(dxold * df))

# so take the Newton step:
dxold = dx
dx = f / df
temp = rts     # saved for rtsafe's convergence check
rts -= dx      # new rts ≈ 0.378
```

I vaguely remember W. Kahan giving a talk about how he got this idea into the solver (the solve button) for the HP35 programmable calculator. This was from the late 1970s and I do not know if he published it or not. I think the opening gambit was bisection and then the solver switched to Newton when the intervals got small.

This article by W. Kahan about the HP-34C solver seems to be based on the secant method.

That’s the one. Thanks for doing the error correction Newton → secant from my fading memory.

That is implemented in the default Order0 method of find_zero in Roots. Other hybrid methods are also possible, switching to a bracketing method once a bracket is identified.
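Minimal usage (the test function is arbitrary):

```julia
using Roots

g(x) = exp(x) - x^4
find_zero(g, 8)             # Order0 is the default for a scalar guess
find_zero(g, 8, Order0())   # the same, spelled out
```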

Good point, thanks for working through the numbers (which I should have done before posting the example :blush: ).

Try f(0.5) = -0.5 and f'(0.5) = -(1 + \epsilon), which gets the Newton iterate arbitrarily close to 0, so the new bracket is (\approx 0, 1). E.g.

```julia
f = -0.5
df = -1.1
```

gives rts = 0.04545...
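Plugging these values into the conditions from your snippet above (reusing the same variable names):

```julia
xl, xh = 0.0, 1.0
rts = 0.5
dxold = 1.0
f, df = -0.5, -1.1

# neither rejection test fires, so the Newton step is accepted
(((rts - xh) * df - f) * ((rts - xl) * df - f) > 0.0) ||
    (abs(2.0 * f) > abs(dxold * df))   # false

rts - f / df   # ≈ 0.04545, so the new bracket is ~95% of the old one
```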

I don’t recall whether the convergence analysis is done in detail, but the LithBoonkkampIJzermanBracket method of find_zero from Roots implements this kind of algorithm, and there is a reference to follow up on.
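If I recall the interface correctly, the call is along these lines (treat the exact signature as an assumption and check the docstring):

```julia
using Roots

h(x)  = exp(x) - x^4
hp(x) = exp(x) - 4x^3

# a bracketing method that also uses the derivative: pass (f, f')
# together with a bracket; this is my reading of the interface,
# worth double-checking against the Roots.jl documentation
find_zero((h, hp), (8, 9), Roots.LithBoonkkampIJzermanBracket())
```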

Thanks, the article (non-paywall version here) was extremely informative. This is the kind of answer I was looking for, and I am delighted to learn that it is already implemented in the excellent Roots.jl.
