Ah, I see. If I plot the sum-of-squares error (SSE) that you are minimizing, computed by
sse(model, xdata, ydata, p) = sum(((x,y),) -> abs2(y - model(x, p)), zip(xdata,ydata))
then for my example data above (x ∈ (0,7)), I get:
which looks like it has a single well-conditioned local minimum at ≈ the ground-truth p = [1,1]. This is why least-squares fitting works so well for me.
In contrast, if I use x data just in (0,0.5), before the peak of the model, then the same plot looks like:
i.e. there is a whole curve in p space that has nearly the same SSE as the ground-truth p = [1,1]. I don’t know if these are all local minima or if it slopes very shallowly down towards the ground-truth p, but at the very least the minimum is quite ill-conditioned. In particular, we can compute the condition number of the Hessian using:
using ForwardDiff, LinearAlgebra
H = ForwardDiff.hessian(p -> sse(model, xdata, ydata, p), [1,1])
cond(H)
which gives ≈ 145, i.e. the second derivative at the ground truth is about 145x smaller in one direction than in the other. That makes the location of the optimum quite sensitive to noise, and also makes optimization converge slowly at best.
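To see which direction in p space is the flat one, an eigendecomposition of the Hessian is more informative than the condition number alone. Here is a minimal sketch of that idea. Note that the model below (`p[1] * x * exp(-p[2] * x)`) is only a stand-in I am assuming for illustration, since it has a peak at x = 1 for p = [1,1], and that I use the Gauss-Newton form of the Hessian (which equals the true SSE Hessian only for noise-free data at the ground truth) so that nothing beyond the LinearAlgebra standard library is needed. The names `dmodel` and `gn_hessian` are my own.

```julia
using LinearAlgebra

# ASSUMPTION: a stand-in model with a peak at x = 1/p[2], not necessarily the one in this thread.
model(x, p) = p[1] * x * exp(-p[2] * x)

# Analytic gradient of the stand-in model with respect to p at a single x.
dmodel(x, p) = [x * exp(-p[2] * x), -p[1] * x^2 * exp(-p[2] * x)]

# Gauss-Newton Hessian of the SSE: 2 * sum of outer products of the model gradients.
# This matches the exact Hessian when the residuals are zero (noise-free data at the true p).
gn_hessian(xdata, p) = 2 * sum(dmodel(x, p) * dmodel(x, p)' for x in xdata)

xdata = range(0, stop=0.5, length=50)      # data only before the peak
H = Symmetric(gn_hessian(xdata, [1.0, 1.0]))

λ, V = eigen(H)                            # eigenvalues are returned in ascending order
println("curvatures along the principal directions: ", λ)
println("condition number (largest / smallest): ", λ[end] / λ[1])
println("flattest direction in p space: ", V[:, 1])
```

The eigenvector paired with the smallest eigenvalue points along the shallow valley, so noise can move the fitted p much further along that direction than across it.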
That being said, I find that I can still get a decent approximation for the ground-truth optimum by lowering the tolerance on curve_fit, so that it runs for more iterations. For example,
fit = curve_fit((x, p) -> model.(x, Ref(p)), xdata, ydata, [12.0, 11.0]; maxIter=1000, show_trace=true, x_tol=0, g_tol=0)
@show fit.param
which, having been forced to run for 1000 iterations, converges to:
[1.2752929578813368, 1.091880904890356]
which is probably as close to the ground-truth p = [1,1] as the noise allows. Again, because the minimum is badly conditioned, the noise in the problem is amplified into a large shift in the minimum (unless you have a huge amount of data).
I’m guessing that as your x range shrinks, the condition number gets worse, so the problem is exacerbated. Even so, you might get acceptable results if you lower the tolerances / increase the iterations as I’ve done above.
(But get data from larger x if you can!)
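One way to check the guess about shrinking x ranges is to scan the condition number over several ranges. The following is only a sketch under stated assumptions: the model is a hypothetical stand-in (`p[1] * x * exp(-p[2] * x)`, peaking at x = 1 for p = [1,1]), the data are taken to be noise-free so that the SSE Hessian at the true p is exactly 2J'J, and the helper names `jacobian` and `hessian_at_truth` are mine.

```julia
using LinearAlgebra

# ASSUMPTION: a stand-in model with a peak at x = 1/p[2], not necessarily the one in this thread.
model(x, p) = p[1] * x * exp(-p[2] * x)

# Analytic Jacobian of the stand-in model with respect to p, one row per data point.
function jacobian(xdata, p)
    J = zeros(length(xdata), 2)
    for (i, x) in enumerate(xdata)
        e = exp(-p[2] * x)
        J[i, 1] = x * e               # ∂model/∂p[1]
        J[i, 2] = -p[1] * x^2 * e     # ∂model/∂p[2]
    end
    return J
end

# With noise-free data the residuals vanish at the true p, so the SSE Hessian there is 2J'J.
function hessian_at_truth(xdata, p)
    J = jacobian(xdata, p)
    return 2 * (J' * J)
end

ptrue = [1.0, 1.0]
for xmax in (7.0, 2.0, 0.5, 0.25)
    xdata = range(0, stop=xmax, length=50)
    println("xmax = ", xmax, "  cond(H) ≈ ", cond(hessian_at_truth(xdata, ptrue)))
end
```

With real noisy data the numbers will differ, but the trend across ranges should give a feel for how much x coverage is needed before the fit becomes reasonably well conditioned.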