Python lmfit takes weights into account, but Julia LsqFit package seems not to

Hello everyone.

Using Python I have been able to do non-linear fits with lmfit creating a residual function for a potential mathematical function to fit, just as follows


def pow_res(pars, E, data, sig):
    values = pars.valuesdict()
    a = values["a"]
    C = values["C"]   
    model = C/E**a
    
    residual = model-data
    chi_sqr = np.divide(residual**2,sig**2)
    
    return chi_sqr

params = Parameters()
params.add("a", value = a0)
params.add("C", value = C0)

I can minimize this function and obtain the pair of parameters a and C, minimizing the chi_sqr value where I add the weights of the data . I do this with two kind of weights. In the first case I set weights to have value one to all the data and I obtain the next fit

image

which has sense because the weight of the big values is more important than the weight of the lower values, making the fit to go crazy over low values. Then I do the second case: set the weights as half the magnitude of each data value, giving it now the same importance to every value in the data, obtaining the next fit

image

This was all made with Python. Now, trying the same with Julia I am trying to use the equivalent library to make nonlinear fits, which would be LsqFit.jl. With this library I am doing the same process that I made with python, working with the two cases I mentioned above. At the beginning I obtain the same fitting result with Julia than with Python when I set the weights to one, as you can see in the figure below

However, when I test the second case, where I set the weights to half the value of the data, then I obtain the same fitting result than when I have the weights set to one, which, as you saw in the Python result, has no sense

The code I am running to do this is

a0 = (log(absortion[1])-log(last(absortion)))/(log(last(energy))-log(energy[1]))
C0 = absortion[1]*energy[1]^a0
p0 = [C0, a0]

model(E, p) = p[1]./(E.^p[2])

synth_wt = 0.5 .*absortion

fit = lsq.curve_fit(model, energy, absortion, synth_wt, p0)
params = msr.measurement.(fit.param, lsq.stderror(fit))

where synth_wt are the weights I sinthetically give to the data, absortion are the values in the y axis, energy the values of the x axis, model is the potential function I am trying to fit, and p0 are the values of initialization of the curve fitting.

I don’t understand why this is happening if I am following the same example of giving the weights to the fitting function that is shown in the LsqFit.jl documentation. If someone knows better, it would be a great help.

Thanks in advance!

Does it always give the same answer? The fitting procedure is Levenberg Marquardt which is iterative and may only find a local minimum. Perhaps try a different starting point several times just to rule out this issue.

Actually, my initialization values give a curve more or less close to the one I should obtain with half-value weights.

However I tried changing the initialization values to half the ones I used and I obtained the same result again.

Yes seems like the weights are not being taken into account. What version of LsqFit.jl are you using?

Just in case, the weights provided to LsqFit.jl should be inverse variances. See this other thread.

1 Like

Thank you all. It was just as rafael.guerra said, the algorithm receives the weights as the inverse variances. Thank you all for your help!


l

The documentation says it’s minimizing sum(w_i*err_i) if you want it to take errors more seriously on the larger values you’d think that larger w would do it. So is there a problem with the documentation?

The Weighted Least Squares section of the documentation tutorial says: “We know the error variance and we set the weight as the inverse of the variance…”.

So is the tutorial at odds with the mathematical expressions in the docs? I’m on my phone it’s hard to figure out what’s what.

Intuitively if it’s minimizing the sum of w*err and we make w bigger this means that error is more important. The OP tries to weight each error by 10% of the value of the measurement. So bigger measurements get more weight. It should force the fit to go closer to the large measurements… But it doesn’t. What is the actual math being carried out?

No, it doesn’t look like that.

From dimensional analyses, you can see that the weights wi in the following expression in the docs should have the same units as the inverse of a variance:

And intuitively, regardless of the magnitude of the fitting errors at some data points, if their uncertainties are large then their respective weights should be small.

1 Like