Techniques to improve resolution of a system of nonlinear equations

Hello,

I would like to solve a set of N = 300 independent 1D nonlinear equations of the type f_i(x_i) = a_i where the a_i are given and each function f_i is a univariate strictly increasing bijection. There is a unique solution for each nonlinear equation. So far, I am using NLsolve.jl with the Newton method to solve simultaneously the equations. I am already using the fact that the Jacobian is diagonal (all the problems are independent).

However the f_i have some bumps (see Figure) that are sometimes challenging for the Newton method if the initial condition is far from the root. How can I improve the convergence?
Is there a way to exploit that the Jacobian has only strictly positive values?

monotone

1 Like

If the problems are independent, why don’t you just solve each equation separately?

5 Likes

Yes, since it’s 1D and monotonically increasing you can use a bracketing method like bisection if you find one value to the left and one value to the right.

4 Likes

Doesn’t Newton’s method converge faster than bisection (at least once you’re close)?

2 Likes

Bisection is much more stable though. The best thing to do would be to use a Falsi method like in https://github.com/JuliaComputing/NonlinearSolve.jl which is a mixture of bracketing for stability but derivative-based for acceleration.

7 Likes

The main reason is that the Newton method is much faster to converge than the bisection method.

I have added a line search and that seems to solve the problem.
Thank you all for your recommendations.

I will check NonlinearSolve.jl

1 Like

for reference if you’re not familiar with the method: https://en.wikipedia.org/wiki/Regula_falsi#The_regula_falsi_(false_position)_method

3 Likes

Bisection is much more robust, and is derivative-free.

For problems with “bumps” like this, plain vanilla Newton’s method can easily diverge. There are safeguards against this, but why bother when you have bisection.

Newton’s method (in its original, simple form) is not a practical general method for univariate or multivariate problems. It is useful when you can prove reasonable convergence analytically, which happens for globally “nice” problems, which are quite rare.

1 Like

I thought the standard (?) method for 1d functions was Brent’s method, which combines bisection and Newton to give a method that is guaranteed to converge (if I remember correctly). I believe this is implemented in the Roots.jl package.

If you need guarantees then there’s IntervalRootFinding.jl

4 Likes

Not quite — it uses the secant method, so it does not need derivatives directly.

Brent’s method can be better than bisection for some problems, but I think that first we should establish whether the problem requires a multivariate solver. If the problem is separable, the major efficiency gain will be from that, and the choice of univariate method is of secondary importance compared to it.

3 Likes

Yes, the problem is separable (all the equations are independent), but my code works best if I perform the evaluation of all the f_i(x_i) at once. Also, the computation of the Jacobian is less expensive than the evaluation of the function.

Start with bisection and turn on Newton when things start looking good (ie all the |f|s are small). This is the algorithm HP used years ago for scalar equations in their programmable calculators.

1 Like

There is no technical reason that should prevent you from implementing a coordinate-wise bisection or other method, it is just a matter of bookkeeping, the algorithm is the same.

This is wise advice. You can use a scalar bisection + Newton-Armijo solver (or roll you own) and loop over the functions. In that way you are likely to converge for most of them and will be able to identify any corner cases for special attention. The line search is really important for problems like this.

One problem with using Newton-Armijo for systems to solve this problem is that the hardest of the equations will govern the line search for all of them. Take this example, please

f1(x) = atan(x); f2(x) = atan(10x); f3(x) = atan(100x);

with x0=1 as the initial iterate for all three equations.

f3 = 0 is a much harder problem for Newton-Armijo that f1=0, which is easy. Solving as a system will make all three problems appear to be hard.

2 Likes