What nonlinear solvers are available?

I am trying to test solvers for nonlinear systems, preferably ones with lots of available algorithms (analogous to the packages Optim.jl and NLopt.jl for optimization). I know of NLsolve.jl, IntervalRootFinding.jl, and NonlinearSolve.jl. Any others?

NonlinearSolve.jl wraps everything that we know of in the Julia ecosystem. If it's missing a method, open an issue and we can wrap that as well. You might want to throw ModelingToolkit.jl into the conversation too, because its tearing pass will accelerate any of the nonlinear solvers, so it's a pretty crucial part of the process for many applications.
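For reference, a minimal sketch of solving a system with NonlinearSolve.jl (the residual, starting point, and solver choice here are just illustrative):

```julia
using NonlinearSolve

# define the residual f(u, p) = 0; here u.^2 - p elementwise
f(u, p) = u .* u .- p

u0 = [1.0, 1.0]   # initial guess
p = 2.0           # parameter

prob = NonlinearProblem(f, u0, p)
sol = solve(prob, NewtonRaphson())
```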

NLSolvers.jl, which will be finished and polished once I have time… but I've been saying that for a while :grinning_face_with_smiling_eyes: It is tagged, but the interface is not fully stable. It contains code similar to Optim and NLsolve (and the optimization part of LsqFit) in one bundle with relatively few dependencies, and is intended to become the new code for Optim and to make NLsolve obsolete. The dependency count matters less than it used to, but the package is intended to be relatively lightweight. It also has several methods that are non-allocating, or at least almost non-allocating, when used with static arrays or similar. I would appreciate any test cases you might have.
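For anyone who wants to contribute one, a test case written against NLsolve's current interface looks like this (this is essentially the system from NLsolve's README):

```julia
using NLsolve

# residual written in-place: F holds f(x), x is the unknown vector
function f!(F, x)
    F[1] = (x[1] + 3) * (x[2]^3 - 7) + 18
    F[2] = sin(x[2] * exp(x[1]) - 1)
end

sol = nlsolve(f!, [0.1, 1.2])
```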

@ChrisRackauckas @pkofod I have been wondering about the need to create a "problem" that gets fed into the solver, as opposed to just feeding a plain function to the solver as in Optim or NLsolve. Why is that? Is it to take advantage of multiple dispatch somehow?

I made the issue here:
https://github.com/SciML/SciMLNLSolve.jl/issues/7

An OptimizationProblem struct is fully specified, in the sense that the way of calculating everything (function, gradient, Hessian, Jacobian, in-place operations) is defined. If you only pass a function to a solver, you push the work of deciding how to calculate all of that onto the solver itself, and the solver defaults may not suit the problem at hand.
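As a sketch (assuming the SciML Optimization.jl interface; the objective and values are just for illustration), the decision of how derivatives are computed is made when the problem is built, not inside the solver:

```julia
using Optimization, OptimizationOptimJL, ForwardDiff

# objective with fixed parameters p
f(x, p) = sum(abs2, x .- p)

# the AD backend is fixed here, at problem-definition time
optf = OptimizationFunction(f, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, zeros(2), [1.0, 2.0])
sol = solve(prob, LBFGS())
```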

But you can pass custom gradients; in Optim, for example, the gradient is the second positional argument.
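For reference, a minimal Optim sketch with a hand-written gradient in the second position (this mirrors the Rosenbrock example in Optim's documentation):

```julia
using Optim

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# in-place gradient: fills G at the point x
function g!(G, x)
    G[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2)
    G[2] = 200.0 * (x[2] - x[1]^2)
end

optimize(f, g!, zeros(2), LBFGS())
```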

You can also pass the gradient to an OptimizationFunction constructor; the customization options are the same in either interface. But, looking at the dispatches in Optim at least, the positional style complicates the code a lot for little benefit over just passing one object with all the relevant information. As one example, the Optim IPNewton solver already requires that you pass two objects plus the initial point.
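A sketch of that, assuming Optimization.jl's keyword interface for user-supplied derivatives (the in-place gradient signature g!(G, x, p) is my assumption here):

```julia
using Optimization, OptimizationOptimJL

f(x, p) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function g!(G, x, p)
    G[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2)
    G[2] = 200.0 * (x[2] - x[1]^2)
end

# the gradient becomes part of the problem definition, not a solver argument
optf = OptimizationFunction(f; grad = g!)
prob = OptimizationProblem(optf, zeros(2))
sol = solve(prob, LBFGS())
```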

I think the approach in Optim is quite user friendly if you just want to call optimize(f, x), but there are a couple of problems from a package developer's perspective.

A lot of bug fixes to Optim have been about confusing errors when a specific combination of f, g, fg, fgh, h, x, lower bounds, upper bounds, algorithm, options, … was passed in. Since f, g, fg, fgh, and h are not typed, it's hard to dispatch properly. That was the original motivation for the NDifferentiable wrappers in Optim and NLsolve (not always constructed by the user, but used internally). They make it much easier to write the "backend" code, because you know you'll receive a bundle with all the relevant methods and information once you reach the algorithm code, and you don't have to pass around 20 positional arguments.
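For concreteness, a sketch of that wrapper idea using OnceDifferentiable from NLSolversBase (the objective here is made up):

```julia
using Optim, NLSolversBase

f(x) = sum(abs2, x .- 1)
g!(G, x) = (G .= 2 .* (x .- 1))

# f and g! are bundled into one typed object the backend can dispatch on
od = OnceDifferentiable(f, g!, zeros(2))
optimize(od, zeros(2), LBFGS())
```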

In NLSolvers.jl I have problem structs as well, but they are always the thing you put into your solve call. This is again for my own sake: it makes it easier to pass information around and to know that the algorithm has what it needs. I think it will also be a good workflow for many users to first define their problem and then call solve on it. I am undecided on whether I will still provide simple minimize(f, x)-style functions, but I might, because they are just very convenient in many cases.

There are multiple reasons for the split. Note that in the simple cases Julia will remove the struct at the compiler level anyway, so it's not a performance hit either. The reasons to do it are:

  1. You need to do this internally anyway, since you want to throw one bundle describing the model around your inner functions.
  2. Having the user interface built around the model bundle allows for automated analysis: modelingtoolkitize(prob) can take the problem and accelerate it automatically, OptimizationFunction(...) can generate the AD parts in a very generic way, etc. Again, internally you always want to work on the bundle of everything that defines the model, so if you want to expose advanced compiler-y features to the user, you need them to work on that bundle as well.
  3. Separation of concerns: how to define a problem vs. how to solve a problem. It makes the documentation of the options much clearer.
  4. Automated conversion of problems. It hasn't come up in optimization yet, but a SecondOrderODEProblem generates a first-order ODEProblem, which then solves normally, etc. Again, if you want functionality on the model itself, you need the user to interact with the model-definition bundle.
  5. Specialization. Most people don't know that overloading optimize(f, x) is slow unless you write optimize(f::F, x) where F… so if you want a dispatching interface optimize(f, alg, x) that other packages hook into, it's much easier to handle it via problem types than to police every downstream package about specialization heuristics (see the sketch below).
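To illustrate the specialization point, a small sketch (the names here are made up): Julia's heuristics may not specialize on a function argument that is only passed through to other calls, unless a type parameter forces it:

```julia
inner(f, x) = f(x) + x

# f is only passed through; Julia may compile one generic method
# shared by all functions, which costs performance in hot code
outer(f, x) = inner(f, x)

# the type parameter forces a specialized method per function type
outer_specialized(f::F, x) where {F} = inner(f, x)
```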

Here is one such test case: