Hi all,
I am currently using Measurements.jl for error propagation and LsqFit.jl for fitting functions to data. Is there a simple way to fit a function to data that has measurement errors? It would be no problem to use another package if that makes things easier.
Thanks for heeding my advice, and welcome to Discourse! Could you explain a bit more about what you mean by "measurements with error"? LsqFit solves Y = m(x, γ) + ϵ, where Y and x are your observations, so it naturally applies to data with errors.
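For reference, a minimal sketch of plain (unweighted) LsqFit usage, with a made-up model and synthetic data:

```julia
using LsqFit

# Toy model m(x, γ) = γ[1] * exp(-γ[2] * x) and synthetic observations
model(x, γ) = γ[1] .* exp.(-γ[2] .* x)
xdata = collect(range(0, 10, length=20))
ydata = model(xdata, [1.0, 0.3]) .+ 0.01 .* randn(20)

# Fit starting from an initial parameter guess
fit = curve_fit(model, xdata, ydata, [0.5, 0.5])
coef(fit)       # estimated γ
stderror(fit)   # standard errors of the estimates
```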
Ah I see - it looks like LsqFit is doing an explicit promotion to float somewhere. Not sure whether one should expect LsqFit to work with Measurements - maybe @pkofod can comment.
What do you expect to happen with the uncertainty in estimation? It looks like measurement error in the covariates (right-hand side) of your equation to me, which traditionally in econometrics would just lead to attenuation bias. Or do you have different (known) degrees of uncertainty in different parts of your data set? In that case this discussion might be relevant, which shows the use of inverse-variance weights in LsqFit:
Thanks again for the reply. The sample code you provided was very helpful. I completely overlooked that there is an option to weight your data. Sadly I don't know that much about statistics, so it is not easy for me to pin down exactly what I need.
I want something that is equivalent to using a weighted mean instead of a plain mean.
Here in my linear regression example it seems I can use a weight wt = 1 ./ df.u_y .^2 to weight my y values. It would be nice to also take the x errors into account and, if possible, to calculate the weights automatically for more complicated functions. I know that's a lot to ask, but maybe something like this already exists.
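For concreteness, a sketch of the inverse-variance weighting I mean (made-up numbers; `u_y` holds the 1-sigma errors on y):

```julia
using LsqFit

# Made-up data with known 1-sigma uncertainties on y
xdata = [0.0, 1.0, 2.0, 3.0, 4.0]
ydata = [0.1, 2.2, 3.9, 6.1, 8.0]
u_y   = [0.1, 0.2, 0.1, 0.3, 0.2]

# Straight-line model and inverse-variance weights
model(x, p) = p[1] .+ p[2] .* x
wt = 1 ./ u_y .^ 2

fit = curve_fit(model, xdata, ydata, wt, [0.0, 1.0])
coef(fit)       # fitted parameters
stderror(fit)   # 1-sigma parameter uncertainties
```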
I'll read up on this, so maybe I can clarify later on.
I think it will be tricky to teach it to do the right and sensible thing. Polynomials (i.e. linear regression) are uncontroversial, right? Just propagate the error through the matrix operations. But I don't know whether MSE minimization has some property guaranteeing that "the error on the fitted parameter is still 1 sigma".
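For the linear case, a minimal sketch of what "propagate the error through the matrix operations" could look like, assuming Measurements.jl's uncertainties survive the generic linear algebra (data made up):

```julia
using Measurements, LinearAlgebra

# Made-up data: exact x, and y values with 1-sigma uncertainties
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1 ± 0.1, 2.2 ± 0.2, 3.9 ± 0.1, 6.1 ± 0.3, 8.0 ± 0.2]

# Design matrix for the straight line y = β[1] + β[2] * x
A = [ones(length(x)) x]

# Least squares via the normal equations; the uncertainties in y
# should propagate through the generic matrix operations into β
β = (A' * A) \ (A' * y)
```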
Not really what is being asked, but another approach would be Bayesian inference: define the MvNormal corresponding to each (x_i, y_i) data point, draw e.g. 100 samples from that distribution, and then use MCMC to estimate the model parameters on the 100n "data points". Does anyone have an idea whether this would work?
(Even better would be to use the MvNormals directly in the model definition, without sampling… I don't know if that's possible in one of Julia's probabilistic DSLs.)
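For what it's worth, a hedged sketch of how the sampling-free version might look in Turing.jl, treating the true x values as latent parameters (the data names `x_obs`, `y_obs`, `u_x`, `u_y` are hypothetical):

```julia
using Turing

@model function line_with_xy_errors(x_obs, y_obs, u_x, u_y)
    a ~ Normal(0, 10)                          # intercept prior
    b ~ Normal(0, 10)                          # slope prior
    x_true ~ arraydist(Normal.(x_obs, u_x))    # latent true covariates
    y_obs ~ arraydist(Normal.(a .+ b .* x_true, u_y))
end

chain = sample(line_with_xy_errors(x_obs, y_obs, u_x, u_y), NUTS(), 1000)
```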
Another (simpler) idea, also based on samples from the distribution of each (x_i, y_i) data point: do an "ensemble fit" and calculate the errors from the spread of the results. Basically what you would get with MonteCarloMeasurements.jl if it worked with LsqFit.jl (I assume it would fail with the same error as Measurements.jl).
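A rough sketch of the manual version of that ensemble fit (made-up straight-line data with errors on both x and y):

```julia
using LsqFit, Statistics

# Made-up data with 1-sigma errors on both x and y
x, u_x = [0.0, 1.0, 2.0, 3.0, 4.0], fill(0.05, 5)
y, u_y = [0.1, 2.2, 3.9, 6.1, 8.0], [0.1, 0.2, 0.1, 0.3, 0.2]
model(xs, p) = p[1] .+ p[2] .* xs

# Refit many replicas of the data perturbed within their errors,
# then take the spread of the fitted parameters as their uncertainty
params = map(1:100) do _
    xs = x .+ u_x .* randn(length(x))
    ys = y .+ u_y .* randn(length(y))
    coef(curve_fit(model, xs, ys, [0.0, 1.0]))
end
P = reduce(hcat, params)
mean(P, dims=2), std(P, dims=2)   # ensemble mean and spread
```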
Has anyone found a better solution yet? Something that also plots the uncertainty of the model implied by the uncertainty in the fitted parameters? Maybe something from SciML, they seem to have things going in that direction: Optimization Under Uncertainty · Overview of Julia's SciML