Nonlinear reduced chi-square fit with error estimates and correlation matrix

Hi all,

I have a nonlinear function

function model(params, args)
    # params define a matrix which is then diagonalized;
    # the outputs are the eigenvalues and some function of the eigenvectors
    return out1, out2
    # both outputs out1, out2 are arrays of size 14x1
    # (the size can be 14xn, depending on args)
end

for which I would like to fit the values of params, such that the reduced chi-square cost function
\chi^2 = \frac{1}{N_{\rm dof}}\sum \limits_{i=1}^{14} \sum \limits_{j=1}^{2} \frac{(out_{j,{\rm expt},i}-out_{j,{\rm calc},i})^2}{(\sigma_{j,{\rm expt},i}+\sigma_{j,\rm mod})^2}

is minimized. In other words, the model takes an array of parameters as input and calculates two observables out1, out2 (both are arrays of size 14), which are then compared to experimental values out1_expt, out2_expt. I consider uncertainties of the experimental values, \sigma_{\rm expt}, as well as of the model \sigma_{\rm mod}. In fact \sigma_{\rm mod} is adjusted manually to achieve roughly \chi^2\simeq 1. N_{\rm dof} is the number of calculated values (2x14=28 here) minus the number of parameters.

So far, I have written my own function to calculate the chi-square as given by the formula above and minimized it using the Optim.jl package. This works; however, I would also like to have error estimates for the optimal parameter values, as well as a correlation matrix.
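Roughly, my current setup looks like the following sketch (out1_expt, sigma1_expt, sigma_mod, p0, args etc. are placeholders for my data, uncertainties, initial parameter guess and extra arguments; the choice of NelderMead is just an example):

using Optim

# reduced chi-square as defined by the formula above
function chi2(params, out1_expt, out2_expt, sigma1_expt, sigma2_expt, sigma_mod, args)
    out1, out2 = model(params, args)
    s = sum(((out1_expt .- out1) ./ (sigma1_expt .+ sigma_mod)).^2) +
        sum(((out2_expt .- out2) ./ (sigma2_expt .+ sigma_mod)).^2)
    ndof = length(out1) + length(out2) - length(params)   # 2*14 minus number of parameters
    return s / ndof
end

res   = optimize(p -> chi2(p, out1_expt, out2_expt, sigma1_expt, sigma2_expt, sigma_mod, args),
                 p0, NelderMead())
p_opt = Optim.minimizer(res)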

For background: this is high-energy physics, and there already exists a program called “MINUIT” that would do the job (and much more…). There is also a Julia wrapper for MINUIT, but it is in fact a Julia wrapper for a Python wrapper for a C++ package, which unfortunately leads to some problems on the workstation I am running on (I don't have admin rights, and the OS is outdated).

Is there any Julia package that does the job out of the box? I checked out LsqFit.jl, which calculates all the necessary outputs (error estimates, …), but as far as I understand it cannot take into account the \sigma_{\rm expt} and \sigma_{\rm mod} of my case.

One solution I see is to extract the code snippets for the error estimates and covariance matrix from LsqFit.jl and adapt them to my code, but maybe there is already some statistics package that does all of it? Or maybe there is a way to tell LsqFit to consider the experimental and model uncertainties?

Best regards

Sorry for the double-posting…

After reading the documentation more carefully, I found that the effect of the uncertainties \sigma can be taken care of by a weighted least-squares fit (see Tutorial · LsqFit.jl). I was not aware of that from just looking at the GitHub page; my mistake.
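For reference, the relevant pattern from the tutorial looks like this (placeholder names; m(x, p) stands for a model in the form curve_fit expects, and the weights are simply the inverse variances, so I can combine my experimental and model uncertainties as in the formula above):

using LsqFit

wt  = 1 ./ (sigma_expt .+ sigma_mod).^2   # weights = inverse variances
fit = curve_fit(m, xdata, ydata, wt, p0)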

Nevertheless, I have another question which is still open to me:
Is it possible to take into account that, e.g. for a given observable out_j, there exists more than one measured data point? In that case, in the formula for \chi^2 above, there would be another (inner) sum over an index k, labelling the measured data points available for each observable out_{j,i}. Somehow I cannot get my head around how I would do that.

Ok, I have managed to get it to work. After more reading, I realized that having (arbitrarily) many observables and array lengths (i_max = 14, j_max = 2), as needed in my case, does not really matter, as long as the output of my function is a single array or vector of the total length. The index into this flat array then takes the role of “xdata” as described in the LsqFit.jl example.
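In other words, something like the following sketch should work directly with the package (placeholder names again; idx maps each measured data point to the corresponding entry of the flattened model output, so several measurements of the same observable simply repeat an index):

# flatten both observables into one vector; the index plays the role of xdata
function flat_model(idx, p)
    out1, out2 = model(p, args)
    calc = vcat(out1, out2)   # total length 2*14 = 28
    return calc[idx]          # repeated indices handle repeated measurements
end

wt  = 1 ./ (sigma_expt .+ sigma_mod).^2
fit = curve_fit(flat_model, idx, ydata, wt, p0)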

For my case it was, however, simpler to extract the relevant code snippets from the package and adjust them to my current code, instead of changing my code to be directly usable with the package. Maybe this means that my code should be improved, but that's how it is now… :slight_smile:

Also, I added the calculation of Pearson's product-moment correlation coefficient (Correlation - Wikipedia)

corr = covar ./ (stderror' .* stderror)

as this is often needed in my field. I will suggest this addition to the package developer.
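For completeness: as far as I can tell from the docs, the same quantities can also be obtained directly from the LsqFit fit object, so within the package it would look roughly like this (perr corresponds to the stderror vector above):

covar = estimate_covar(fit)          # covariance matrix of the fitted parameters
perr  = stderror(fit)                # parameter standard errors
corr  = covar ./ (perr' .* perr)     # Pearson correlation matrix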