TLDR: Is there a way to do multi-dimensional interval root-finding with NonlinearSolve.jl? From my understanding, it only supports one-dimensional interval root-finding right now, so I was wondering whether there’s an ideal way around this, like modifying the objective function such that attempted roots that are outside my interval are penalized.
I need to do root-finding on a 3D system where all my functions are really only defined over a bounded domain (they’re built with Interpolations.jl and I want to avoid extrapolation as much as possible). I know that the roots actually live within the domain, but the solver often wants to ATTEMPT off-domain points at the boundaries, and I want to rule that out.
(I’m open to using other packages. I know IntervalRootFinding.jl exists and it’d be perfect for me, but I can’t get it to work with functions defined with Interpolations.jl.)
Unfortunately, most univariate root-finding algorithms, especially the derivative-free / bracketing ones, do not generalize to multivariate settings.
There are two alternatives that I know of. First, root-finding can be cast as an optimization problem (minimize some norm of the residual), so you can use Optim.jl or one of the other optimization libraries that handle upper/lower bounds. This is what I would do.
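A minimal sketch of that approach with Optim.jl’s box-constrained `Fminbox` wrapper (the residual `f` below is just a placeholder for your interpolated system, and the bounds stand in for your interpolation domain):

```julia
using Optim

# Placeholder residual; substitute your interpolated 3D system here.
f(x) = [x[1]^2 + x[2] - 1, x[2] * x[3] - 0.5, x[1] - x[3]^2]

# Scalar objective: half the squared norm of the residual.
g(x) = 0.5 * sum(abs2, f(x))

lower = [-2.0, -2.0, -2.0]   # lower corner of the interpolation domain
upper = [ 2.0,  2.0,  2.0]   # upper corner
x0    = [ 0.5,  0.5,  0.5]   # starting guess strictly inside the box

# Fminbox wraps an unconstrained inner optimizer in a barrier for the box,
# so the iterates stay within [lower, upper].
res = optimize(g, lower, upper, x0, Fminbox(LBFGS()))
Optim.minimizer(res)
```

A root of `f` corresponds to a minimizer where `g` is (numerically) zero, so it is worth checking the final objective value.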
In case you do not find that appealing, you can extend your function like this: find the nearest point within the domain (see `clamp`), then compute the distance to that point and add a penalty based on it. This is rather ad hoc, and I imagine the method above would be better, but for a well-behaved 3D problem it should work too.
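Roughly like this (a hypothetical wrapper; the penalty `weight` is arbitrary and may need tuning):

```julia
# Evaluate the residual at the nearest in-domain point and add a penalty
# proportional to the squared distance from the domain, so the solver is
# pushed back into the box and f is never evaluated outside it.
function penalized_residual(f, x, lower, upper; weight = 1e3)
    xc = clamp.(x, lower, upper)    # nearest point inside the domain
    d2 = sum(abs2, x .- xc)         # squared distance to the domain
    return f(xc) .+ weight * d2     # penalty vanishes inside the box
end
```

You would then hand `x -> penalized_residual(f, x, lower, upper)` to the solver in place of `f` itself.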
This seems like a sub-optimal choice, for the same reason that generic optimization algorithms are often sub-optimal for least-squares problems (see e.g. Nocedal and Wright, chapter 10).
In particular, if you are solving f(x) = 0 by minimizing g(x) = \frac{1}{2} \Vert f(x) \Vert^2 = \frac{1}{2} f(x)^T f(x) for a function f: \mathbb{R}^n \to \mathbb{R}^n, note that g(x) has a gradient and Hessian matrix given by:
\nabla g = f'^T f \, , \quad \text{Hessian } g'' = f'^T f' + \sum_{j=1}^n f_j f_j'' \, ,
where f' is the n \times n Jacobian matrix of f. However, if you are close to the root f \approx 0, then the Hessian g'' is dominated by the first term g'' \approx f'^T f', which means that you can compute a good approximation for the Hessian “for free” knowing only the first derivative (the Jacobian) of f.
That’s why specialized nonlinear least-squares algorithms like Levenberg–Marquardt are often superior to generic optimization algorithms for least-squares problems, and this seems doubly true for root-finding problems where you know f=0 at the minimum. Levenberg–Marquardt can take approximate Newton steps given only f'.
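To make that concrete, a single Levenberg–Marquardt step needs only the residual and its Jacobian (a minimal sketch; λ is the damping parameter):

```julia
using LinearAlgebra

# One Levenberg–Marquardt step for f(x) = 0, given fx = f(x) and the
# Jacobian J = f'(x). λ = 0 recovers a pure Gauss–Newton step; larger λ
# damps the step toward steepest descent. No second derivatives appear.
lm_step(x, fx, J; λ = 1e-3) = x - (J' * J + λ * I) \ (J' * fx)
```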
For example, LeastSquaresOptim.jl implements Levenberg–Marquardt and supports bound constraints. Of course, it might then converge to a spurious minimum (not a root) pressed up against a constraint, but it can do that even without constraints.
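A sketch of what that might look like (the in-place residual `f!` is a placeholder, and I am assuming bounds are passed as `lower`/`upper` keywords to `optimize!`):

```julia
using LeastSquaresOptim

# Placeholder in-place residual; substitute your interpolated system.
function f!(out, x)
    out[1] = x[1]^2 + x[2] - 1
    out[2] = x[2] * x[3] - 0.5
    out[3] = x[1] - x[3]^2
end

prob = LeastSquaresProblem(x = [0.5, 0.5, 0.5], f! = f!, output_length = 3)
optimize!(prob, LevenbergMarquardt(),
          lower = fill(-2.0, 3), upper = fill(2.0, 3))
```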
If you have the Jacobian f', you could implement a multidimensional Newton step in a few lines of code, since it is just
x \longleftarrow x - f'(x)^{-1} f(x)
and you can just clamp this to ensure that it doesn’t take steps outside the domain (though this may again get stuck if your function has a local minimum near the boundary, or anywhere for that matter — you can only guarantee a root from Newton methods if you start sufficiently close to it, or if your function is nicely behaved).
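A minimal sketch of that clamped Newton iteration, assuming `jac(x)` returns the n × n Jacobian f'(x) (e.g. via ForwardDiff or Interpolations.jl gradients):

```julia
using LinearAlgebra

# Newton's method with every iterate clamped back into the box, so f is
# never evaluated outside the domain. May stall at the boundary if the
# clamped step stops making progress, as noted above.
function clamped_newton(f, jac, x0, lower, upper; tol = 1e-10, maxiter = 100)
    x = clamp.(x0, lower, upper)
    for _ in 1:maxiter
        fx = f(x)
        norm(fx) < tol && return x
        x = clamp.(x - jac(x) \ fx, lower, upper)
    end
    return x   # may not have converged; check norm(f(x)) before trusting it
end
```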
Thanks for the correction! I am curious what you think of Levenberg–Marquardt vs. trust-region methods for nonlinear least squares. I have been using the latter (with dogleg) so far, but a colleague recommended L-M as more robust.
Constrained root finding can be cast as the “feasibility problem”
\text{minimize } 0 \quad \text{subject to} \quad F(x) = 0, \quad \ell \le x \le u
where the objective function is zero (or any constant). You can pass the above problem to, e.g., IPOPT, which will honor the bound constraints.
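For example, via JuMP (a sketch; the algebraic system below is a placeholder, and an interpolated F would have to be registered as a user-defined operator):

```julia
using JuMP, Ipopt

# Feasibility problem: constant objective, F(x) = 0 as equality
# constraints, and box bounds that Ipopt honors at every iterate.
model = Model(Ipopt.Optimizer)
@variable(model, -2.0 <= x[1:3] <= 2.0, start = 0.5)
@constraint(model, x[1]^2 + x[2] - 1 == 0)   # placeholder system F(x) = 0
@constraint(model, x[2] * x[3] - 0.5 == 0)
@constraint(model, x[1] - x[3]^2 == 0)
@objective(model, Min, 0)                    # any constant works
optimize!(model)
value.(x)
```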
It’s a “feature” of every smooth optimization solver that it will either find a feasible point (which is what you are after here), or an infeasible one where ‖F(x)‖ is minimized (called a stationary point of infeasibility).
Other solvers that accept general constraints would work as well, not just IPOPT. Because IPOPT is an interior-point method, it will never attempt trial points outside the bounds.