Alternatives to LsqFit for nonlinear curve fitting?

Hi!
I’m trying to use Julia to fit a curve with the functional form:

model = ( p[1] / x ) * exp( -p[2] / x )

to a data-set I have.

I tried LsqFit; however, unless I make really good initial guesses for the parameters p[1] and p[2], the algorithm can't find a good fit (or even a remotely close one), which kind of defeats the point.

Are there alternatives to LsqFit that are more robust? Or is there a way to use LsqFit in a better way than simply calling curve_fit?

Thanks !!

Have you tried fitting the log?

Do you mean to transform x → log10(x) within the same model structure?

What are the ranges of x and y in the data to be fitted?

1e-5 < x < 1e-3 and 0.6 < y < 0.8
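
For what it's worth, one reading of the "fitting the log" suggestion above (this is my interpretation, so treat it as a sketch rather than the suggester's exact intent, and logfit below is just an illustrative helper): take the log of both sides, log(y) = log(p[1]) - log(x) - p[2]/x, which is linear in the unknowns log(p[1]) and p[2], so it can be solved by ordinary linear least squares with no initial guess at all. The y values here are all positive, so the log is well defined:

using LinearAlgebra

# solve log(y) + log(x) = log(p[1]) - p[2]/x for (log(p[1]), p[2]) by linear least squares
function logfit(xdata, ydata)
    A = hcat(ones(length(xdata)), -1 ./ xdata)   # columns multiply log(p[1]) and p[2]
    b = log.(ydata) .+ log.(xdata)               # move the known -log(x) term to the left-hand side
    c = A \ b                                    # linear least-squares solve
    return [exp(c[1]), c[2]]                     # recover (p[1], p[2])
end

Note that this minimizes the error in log(y) rather than in y itself, so the result can differ slightly from the nonlinear fit, but it is often good enough, or at least a good starting guess for curve_fit.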

I can’t reproduce your problem. For this particular model, I’m finding that LsqFit is pretty robustly finding the parameters. But maybe you are in a very different regime?

For example, I considered ground-truth parameters p = [1,1], with 30 noisy data points on x ∈ (0, 7), generated by the following code:

model(x, p) = ( p[1] / x ) * exp( -p[2] / x )
p = [1,1]                                           # ground-truth parameters
xdata = rand(30) * 7                                # 30 random x values in (0, 7)
ydata = model.(xdata, Ref(p)) .+ randn.() .* 0.02   # model values plus Gaussian noise

which looks like:
[plot of the noisy (xdata, ydata) samples]

Then, if I run curve_fit from various starting points, it always seems to recover p ≈ [1,1]. Even a starting guess that is way off works:

using LsqFit

fit = curve_fit((x, p) -> model.(x, Ref(p)), xdata, ydata, [12.0, 32.0])
fit.param

gives [1.0053649510098448, 0.9865169899979824].

What does the data that you are trying to fit look like?

It might be worth plotting your least-square objective function as a function of p to see if it has multiple local minima for your data, or if it is badly conditioned (compute the condition number of the Hessian matrix at the optimum — ForwardDiff.jl or TaylorDiff.jl should have no trouble computing the Hessian of a simple two-parameter function like this).
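
For instance, here is a rough sketch of one way such a plot could be made (this is just an illustration, assuming Plots.jl, and it reuses the model, xdata, and ydata from the example above; substitute your own data):

using Plots

sse(p) = sum(abs2, ydata .- model.(xdata, Ref(p)))   # least-squares objective as a function of p

p1s = range(0.1, 3, length = 100)
p2s = range(0.1, 3, length = 100)
contour(p1s, p2s, (p1, p2) -> log10(sse([p1, p2])), xlabel = "p[1]", ylabel = "p[2]")

Plotting log10 of the SSE makes it easier to see shallow valleys and multiple minima.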

4 Likes

First of all, thank you for the help!
Second, you are all probably right: my data falls in the very initial section of the functional form (the very small values of x, where the curve starts to rise in a roughly parabolic fashion). That's probably why the algorithm has trouble finding the fit.

1 Like

Ah, I see. If I plot the sum-of-squares error (SSE) that you are minimizing, computed by

sse(model, xdata, ydata, p) = sum(((x,y),) -> abs2(y - model(x, p)), zip(xdata, ydata))   # sum of squared residuals over the data

then for my example data above (x ∈ (0, 7)), I get:
[contour plot of the SSE as a function of p for the x ∈ (0, 7) data]
which looks like it has a single well-conditioned local minimum at ≈ the ground-truth p = [1,1]. This is why least-square fitting works so well for me.

In contrast, if I use x data just in (0,0.5), before the peak of the model, then the same plot looks like:
[contour plot of the SSE as a function of p for the x ∈ (0, 0.5) data]
i.e. there is a whole curve in p space that has nearly the same SSE as the ground-truth p = [1,1]. I don't know if these are all local minima or if it slopes very shallowly down towards the ground-truth p, but at the very least the minimum is quite ill-conditioned. In particular, we can compute the condition number of the Hessian using:

using ForwardDiff, LinearAlgebra
H = ForwardDiff.hessian(p -> sse(model, xdata, ydata, p), [1,1])
cond(H)

which gives ≈ 145 — the second derivative at the ground truth is 145x smaller in one direction than the other, which makes the location of the optimum quite sensitive to noise, and also makes optimization converge slowly at best.
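
To make that concrete, the two eigenvalues of the (symmetric) Hessian are the curvatures along its principal directions, and their ratio is exactly the condition number:

λ = eigvals(Symmetric(H))   # curvatures along the principal directions (sorted ascending)
λ[end] / λ[1]               # ≈ cond(H) ≈ 145 for a positive-definite Hessian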

That being said, I find that I can still get a decent approximation for the ground-truth optimum by lowering the tolerance on curve_fit, so that it runs for more iterations. For example,

fit = curve_fit((x, p) -> model.(x, Ref(p)), xdata, ydata, [12.0, 11.0]; maxIter=1000, show_trace=true, x_tol=0, g_tol=0)
@show fit.param

where I’ve forced it to run for 1000 iterations, converges to:

[1.2752929578813368, 1.091880904890356]

which is probably as close to the ground-truth p = [1,1] as the noise allows; again, because the minimum is badly conditioned, the noise in the problem is amplified into a large shift in the minimum (unless you have a huge amount of data).

I'm guessing that as your x range shrinks, the condition number gets worse, so the problem is exacerbated. Still, you might get acceptable results if you lower the tolerances / increase the number of iterations as I've done above.

(But get data from larger x if you can!)

22 Likes

Fantastic response! Very enlightening and insightful! I definitely learned a lot.
Much appreciated!

1 Like

So is LsqFit.jl then the go-to package for fitting equations like this in Julia? No one who has responded to this post so far has addressed this portion of the OP.

1 Like

There are plenty of other optimization packages if you want to treat least-square fitting as a general optimization problem.
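
For example, the same fit can be posed as generic minimization of the scalar SSE objective, e.g. with Optim.jl (a rough sketch, reusing the model, xdata, and ydata from earlier in the thread):

using Optim

obj(p) = sum(abs2, ydata .- model.(xdata, Ref(p)))              # scalar least-squares objective
res = optimize(obj, [12.0, 32.0], BFGS(); autodiff = :forward)  # gradient via ForwardDiff
Optim.minimizer(res)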

But if you want to take advantage of the special structure of least-squares optimization, some options currently seem to be LsqFit.jl, LeastSquaresOptim.jl, NLLSsolver.jl, and Manopt.jl. (Update: See replies for more packages.) See also Comparing non-linear least squares solvers.

In the particular problem raised in the present thread, however, the lack of robustness was intrinsic to the optimization problem being posed rather than a property of the algorithm, since the minimum was ill-conditioned.

2 Likes

LsqFit is actually mainly an API for quick use of Optim.jl for fitting. I’d also recommend GLM.jl for anything that can be linearized, which is quite common.
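
For the model in this thread, for instance, taking logs gives log(y) + log(x) = log(p[1]) - p[2]/x, which is linear in the unknowns, so something along these lines could work (an illustrative sketch, assuming DataFrames.jl and data vectors xdata and ydata):

using DataFrames, GLM

df = DataFrame(invx = 1 ./ xdata, lhs = log.(ydata) .+ log.(xdata))
linfit = lm(@formula(lhs ~ invx), df)
coef(linfit)   # the intercept estimates log(p[1]); the slope estimates -p[2]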

1 Like

I’ve found the pretty new NonlinearSolve.jl to be by far the best and easiest to use. Nonlinear Least Squares Solvers · NonlinearSolve.jl
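
To give a flavor of what that looks like for the model in this thread, here's a rough sketch based on my reading of the NonlinearLeastSquaresProblem interface (double-check the linked docs for the current API):

using NonlinearSolve

model(x, p) = (p[1] / x) * exp(-p[2] / x)

# residuals in NonlinearSolve's (u, p) convention: u holds the fit parameters,
# and the second argument carries the fixed data
residuals(u, data) = model.(data.x, Ref(u)) .- data.y

data = (x = xdata, y = ydata)
prob = NonlinearLeastSquaresProblem(residuals, [12.0, 32.0], data)
sol = solve(prob)   # or e.g. solve(prob, LevenbergMarquardt())
sol.u               # fitted parameters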

3 Likes