Comparing non-linear least squares solvers

I’m creating this thread as a resource for people looking to solve non-linear least squares problems in Julia, so they know what tools are available, what features they have and how well they perform.

UPDATE: This comparison has been added to juliapackagecomparisons.github.io

I’ll start with solvers and features. This list simply reflects the solvers I found and was able to test, and my understanding of their features.

| Feature | Ipopt | JSO | NLLSsolver.jl | LeastSquaresOptim.jl |
| --- | --- | --- | --- | --- |
| Registered package(s) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Uses JuMP model definition | :white_check_mark: | :white_check_mark: | :x: | :x: |
| Bound constraints | :white_check_mark: | :white_check_mark: | :x: | :white_check_mark: |
| Equality constraints | :white_check_mark: | :x: | :x: | :x: |
| Non-linear constraints | :white_check_mark: | :x: | :x: | :x: |
| Robustified cost functions | :x: | :x: | :white_check_mark: | :x: |
| Non-Euclidean variables | :x: | :x: | :white_check_mark: | :x: |
| Dense auto-differentiation | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Supports sparsity | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Sparse auto-differentiation | :white_check_mark: | :white_check_mark: | :white_check_mark: | :x: |

The following performance comparisons are evaluated on an Apple M1 Pro CPU, using this script.

Optimization times on small, dense problems of varying sizes:

All the solvers used auto-differentiation to compute the derivatives. Ipopt got stuck in a local minimum for the final (largest) problem, but all other solvers found the global optimum.

Optimization times on medium-sized, sparse problems of varying sizes:

All the solvers except LeastSquaresOptim used auto-differentiation to compute the derivatives. The latter used hand-computed derivatives, since auto-differentiation of sparse systems is not supported. All solvers reached the global optimum.

If you believe other solvers should be in this comparison, please first add them to the benchmark, and send me the updated code (link to a gist is best).

17 Likes

In terms of performance it would be nice to benchmark all solvers over a range of problem types and sizes.

For now I’ve benchmarked a few solvers on a multi-dimensional Rosenbrock problem, which leads to a sparse, bi-diagonal linear system. I’ve also added a similar, dense problem.
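
For reference, the usual residual form of the multi-dimensional Rosenbrock problem (it matches the FA/FB expressions that appear further down this thread) is

$$r_{2k-1}(x) = 10\,(x_{k+1} - x_k^2), \qquad r_{2k}(x) = 1 - x_k, \qquad k = 1, \dots, n-1,$$

so each residual depends on at most two adjacent variables, which is what gives the sparse, bi-diagonal structure.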

I’ve now added the graphs to the first post in this thread. I’ll aim to keep these graphs updated as more solvers are added.

2 Likes

I guess the other option to add to the table is JuMP directly with Ipopt. Something like:

```julia
using JuMP, Ipopt

function main(f::Function, data::Vector, y::Vector)
    model = Model(Ipopt.Optimizer)
    @variable(model, x)
    @variable(model, residuals[1:length(y)])
    @constraint(model, residuals .== f(data, x) .- y)
    @objective(model, Min, sum(residuals.^2))
    optimize!(model)
    return value.(x)
end
```
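
For example, a call might look like this (the model function and data below are made up purely for illustration, and the nonlinear `exp` expression needs a reasonably recent JuMP version):

```julia
# Hypothetical one-parameter exponential decay model and synthetic data
f(data, x) = exp.(-x .* data)

data = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 0.61, 0.37, 0.22, 0.14]

x_fit = main(f, data, y)   # returns the fitted value of x
```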
1 Like

I’m definitely interested to try Ipopt. Why do you add residuals as variables with a constraint, instead of just doing this?

```julia
@objective(model, Min, sum((f(data, x) .- y) .^ 2))
```

1 Like

See also the thread: Suggestions needed for bound-constrained least squares solver

The StructuredOptimization.jl package is worth considering here too.

2 Likes

So that the Hessian is diagonal. It’s usually more efficient.

2 Likes

Manopt.jl also has a non-linear least square solver: Levenberg–Marquardt · Manopt.jl (though it’s not the main focus of the package). It supports manifold constraints only and can be used with AD and sparse Jacobians.

5 Likes

I don’t understand this. The Hessian block for x will (in general) not be diagonal.

In any case I haven’t been able to successfully create a sum of squares cost function for Rosenbrock. Whatever I do generates a compilation error. Here’s one such implementation, along the lines of your approach:

```julia
model = Model(Ipopt.Optimizer)
x0 = [(n ≤ 16 && i % 2 == 0) ? 1.0 : -1.2 for i = 1:n]
@variable(model, x[i = 1:n], start = x0[i])
@NLexpression(model, FA[k = 1:(n - 1)], 10 * (x[k + 1] - x[k]^2))
@expression(model, FB[k = 1:(n - 1)], 1 - x[k])
@variable(model, residuals[1:2*(n-1)])
@constraint(model, residuals .== [FA; FB])
@objective(model, Min, sum(residuals .^ 2))
```

The Hessian of the objective function with respect to all problem variables ([x; residuals]) will be diagonal with either 1 or 0s.

Changing the `@constraint` line to

```julia
@NLconstraint(model, [k = 1:2*(n-1)], residuals[k] == [FA; FB][k])
```

should do it.
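
Putting the two together, your snippet with only the constraint line swapped (assuming `n` is defined and JuMP/Ipopt are loaded as above):

```julia
model = Model(Ipopt.Optimizer)
x0 = [(n ≤ 16 && i % 2 == 0) ? 1.0 : -1.2 for i = 1:n]
@variable(model, x[i = 1:n], start = x0[i])
@NLexpression(model, FA[k = 1:(n - 1)], 10 * (x[k + 1] - x[k]^2))
@expression(model, FB[k = 1:(n - 1)], 1 - x[k])
@variable(model, residuals[1:2*(n-1)])
# Non-linear equality constraints linking the residual variables to the expressions
@NLconstraint(model, [k = 1:2*(n-1)], residuals[k] == [FA; FB][k])
@objective(model, Min, sum(residuals .^ 2))
optimize!(model)
```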

I’m very confused. A simple by-hand derivation shows that the Hessian of the cost function I define above is neither diagonal nor made up of 1s and 0s. Note that the variables x are not integers; they’re real values.

Thanks. This worked (ran). However, it’s incredibly slow/fails to converge. Is there a way to set the maximum optimization time in JuMP?

I should have said 2 or 0, given the power of 2, but this is the context: Hessian matrix - Wikipedia.
If $f(x_1, r_1, r_2) = r_1^2 + r_2^2$ then $H(x_1, r_1, r_2) = \left(\begin{array}{ccc}0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2\end{array}\right)$.

What is n in your tests?

`JuMP.set_time_limit_sec(model::Model, limit::Real)`
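
For example:

```julia
using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
set_time_limit_sec(model, 30.0)  # ask the solver to stop after 30 seconds of wall-clock time
```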

Ah yes. Thanks. But then all the non-linearity is in the equality constraints. Do these not appear in some Hessian of the dual? My memory of Lagrange multipliers is hazy, but I thought they too had a Hessian? If that information isn’t present in the system then it’s going to converge very slowly. No wonder it performs badly (see the graph below).
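
Writing out what I half-remember: for the reformulation $\min_{x,r} \sum_i r_i^2$ subject to $r_i = F_i(x)$, the Lagrangian is

$$\mathcal{L}(x, r, \lambda) = \sum_i r_i^2 - \sum_i \lambda_i \big(r_i - F_i(x)\big),$$

so the curvature of the residuals only enters via $\nabla^2_{xx}\mathcal{L} = \sum_i \lambda_i \nabla^2 F_i(x)$, weighted by the multiplier estimates, rather than appearing in the objective Hessian directly.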

Is this sensible, @odow? What would it look like without the equality constraints; i.e. a standard, unconstrained least squares formulation? I can’t get anything written like that to compile.

The largest is 10000.

Thanks. Here’s a plot with the time limited to 30s:

Note: it looks like you’re simply terminating at 2000 iterations, based on line 23 of your comparison code:

```julia
function NLLStest(sizes)
    # First run the optimization to compile everything
    options = NLLSOptions(maxiters = 2000)
```

Yes, NLLSsolver does indeed terminate after 2000 iterations currently. This happens only for the largest test case (n = 10000). The other solvers terminate after far fewer iterations (and at a higher cost), having hit the 30-second runtime limit. NLLSsolver can’t currently terminate based on runtime, but I will implement that. I’m also planning to add visualizations of the final cost.

In the meantime here’s a graph of tests for which all solvers reached the optimal solution:

That is an interesting comparison. As @MateuszBaran mentioned, Manopt.jl can do least squares on arbitrary manifolds (those from Manifolds.jl), so I would be interested in how (and which) non-Euclidean variables are realised in NLLSsolver.jl. Maybe we can join forces a bit on that non-Euclidean matter.

I’ve implemented 3D rotation matrices (see the implementation here) and bounded variables (as we discussed before), but it’s easy to add your own. It should be very straightforward to use variables from Manifolds.jl in NLLSsolver.jl. You just need to overload the NLLSsolver.update method for the particular variable type.
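
As an illustration of the kind of operation such an update performs for a manifold-valued variable, here is a sketch written with Manifolds.jl/ManifoldsBase.jl (this is just the idea, not NLLSsolver’s actual API):

```julia
using Manifolds, ManifoldsBase

M = Sphere(2)                       # unit sphere embedded in R^3
p = [0.0, 0.0, 1.0]                 # current estimate of the variable (a point on M)
step = [0.1, -0.05]                 # solver's update step, one entry per manifold dimension

# Express the step as a tangent vector at p, then retract back onto the manifold
X = get_vector(M, p, step, DefaultOrthonormalBasis())
p_new = retract(M, p, X)
```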

I’m happy to discuss it further. Please message me direct.

I’m slowly adding more solvers. I may try to add that one. But it’s a Euclidean problem.

Yes, I remember. We could check whether your update could be based on ManifoldsBase.jl (as well); then it might work on manifolds more generally (but I would have to check how you do the update).

This is a great comparison! Would you consider making a PR to add it to JuliaPackageComparisons?

2 Likes

Just one note on terminology: my understanding is that Ipopt is the solver, while JuMP is the software that lets you write your optimisation problem in nice mathematical, solver-independent terms and then builds the actual problem to pass to the solver…

1 Like

Thanks. I’ll aim for that. I’d like to broaden the tests and generate a few more statistics first.