How to fit a function to measurements with error?

Hi all,
I am currently using Measurements.jl for error propagation and LsqFit.jl for fitting functions to data. Is there a simple way to fit a function to data with errors? It would be no problem to use an other package if that makes things easier.

Thanks in advance for your help.

2 Likes

Thanks for heeding my advice and welcome to Discourse! Could you explain a bit more about what you mean by “measurements with error”? LsqFit solves Y=m(x,γ)+ϵ, where Y and x are your observations, so it naturally applies to data with errors.

3 Likes

Hello nice to see you again!

currently I am storing my data in a dataframe like this

data= DataFrame("xdata" => 
        [
        9.7 ± 0.1,
        6.6 ± 0.1,
        ...

if I try to fit it using

fit = curve_fit(lineareq, data.xdata,data.ydata, [1.,1.])

I get an error message

MethodError: no method matching Float64(::Measurement{Float64})

Ah I see - it looks like LsqFit is doing an explicit promotion to float somewhere. Not sure whether one should expect LsqFit to work with Measurements - maybe @pkofod can comment.

What do you expect to happen with the uncertainty in estimation? It looks like measurement error in the covariates (right hand side) of your equation to me, which traditionally in econometrics would just lead to attenuation bias. Or do you have different (known) degrees of uncertainties in different parts of your data set? In that case this discussion here might be of relevance, which shows the use of inverse variance weights in LsqFit:

Thanks again for the reply. The sample code you provided was very helpfull. I completely overlooked, that there is an option to weight your data. Sadly I dont know that much about statistics, so it is not easy to point out what I need.

I want something that is equal to using weighted mean instead of a mean.

Here in my linear regression example it seems I can use a weight wt = 1 ./ df.u_y .^2 to weight my y values. It would be nice to also consider the x error and if possible automatically calculate the weights for more difficult functions. I know it is much to ask, but maybe there is something like this out there.

Ill read up on this so maybe I can clarify later on

2 Likes

Polynomials work, but I agree having LsqFit just work would be great

3 Likes

Probably it needs to be taught how to weight the errors

1 Like

I think it will be tricky to teach it to do the right&sensible thing. Polynomials (i.e. linear regression) is non controversial right? Just propagate error through the matrix operations. But idk if mse minimization has some property saying “the error on fitted parameter is still 1 sigma”

Not really what is being asked, but another approach would be a Bayesian inference: defining the MvNormal corresponding to each (x_i,y_i) data point, and drawing e.g. 100 samples from that distribution, then using MCMC to estimate the model parameters on 100n “data points”. Does anyone have an idea whether this should work?

(Even better would be to use the MVNormals directly in the model definition without sampling… I don’t know if that’s possible in one of Juli’a probabilistic DSLs.)

1 Like

There is this package for dealing with outliers as well:

https://github.com/fsobral/RAFF.jl

3 Likes

Another (simpler) idea also based on samples from the distribution of each (x_i,y_i) data point: make an “ensemble fit” and calculate errors from the set of results. Basically what you would get with MonteCarloMeasurements.jl if it worked with LsqFit.jl (I assume it would fail with the same error as with Measurements.jl).

Did someone find a better version yet? Something that also plots the uncertainty of the model given by the uncertainty in the model parameters? Maybe something from SciML, they seem to have things that go into that direction: Optimization Under Uncertainty · Overview of Julia's SciML

I found something that works!

pkg> add https://github.com/chriswaudby/LsqFit.jl.git#MeasurementsExtension

For example

using LsqFit, Plots, Measurements
@. gauss(x,p) = p[1] + p[2] * exp(-(x-p[3])^2/p[4]^2)
x = range(-1,1,100)
yerr = gauss(x,[0,1,0,0.5]) + 0.1*randn(length(x)) .± 0.1
fit = curve_fit(gauss, x,yerr,Float64[0,1,0,1])
begin
	scatter(x,yerr)
	plot!(x,gauss(x,fit.param);ribbon = gauss(x,fit.param.+stderror(fit)) - gauss(x,fit.param.-stderror(fit)))
end
savefig("fit.png")

fit

1 Like