My use case is extremely simple. Peak lineshape fitting to experimental data. Error bars on the data are known. A minimal example of a simple peak fit is shown below when working with python
and lmfit
:
"""
lorentzian.py
Generate noisy data that is Lorentzian in nature, fit a Lorentzian to the
data and report parameter uncertainties using LMFIT
"""
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
np.random.seed(777)
def lorentzian(x, a, c, s):
L = (a / np.pi) * (s / ((x - c)**2 + s**2))
return L
"""true parameters"""
a = 10.0
c = 0.0
s = 0.02
x_data = np.linspace(-0.5, 0.5, 101)
noise = np.random.normal(0, 7, len(x_data))
y_data = lorentzian(x_data, a, c, s) + noise
dy_data = np.sqrt(y_data - noise) # assumed poissonian noise
data_xye = np.array([x_data, y_data, dy_data]).T
np.savetxt("lorentzian_peak.dat", data_xye, delimiter="\t")
"""fit with lmfit"""
model = Model(lorentzian)
model.nan_policy = "omit"
params = model.make_params(a=17.0, c=0.0, s=0.04)
fit_result = model.fit(y_data, params, x=x_data, weights=1.0/dy_data)
fit_ax = fit_result.plot_fit(numpoints=701, fit_kws={"lw":3})
fit_fig = plt.gcf()
fit_fig.suptitle("Lorentzian fit with LMFIT")
fit_fig.set_size_inches(11.7, 8.3)
"""fit statistics"""
fit_stats = fit_result.fit_report()
fit_ax.text(0.05, 0.5, fit_stats, transform=fit_ax.transAxes)
plt.tight_layout()
plt.savefig("lmfit_fit.pdf", dpi=300)
plt.show()
An analogous work flow in Julia is as follows:
"""
lorentzian.jl
Fit a Lorentzian using Optimization.jl
"""
using Optimization, OptimizationNLopt
using Plots
default(size=(1080, 720))
default(margin=5Plots.mm)
default(linewidth=3.5)
function lorentzian(x::Real, a::Real, c::Real, s::Real)::Float64
(a / pi) * (s / ((x - c)^2 + s^2))
end
function chisqr_lorentzian(u, p)::Float64
x_data = p[1]
y_data = p[2]
dy_data = p[3]
y_calc = [lorentzian(x, u...) for x in x_data]
sum( ((y_data .- y_calc) ./ dy_data) .^ 2 )
end
"""load data generated with python"""
dfile = open("lorentzian_peak.dat", "r")
x_data = Float64[]
y_data = Float64[]
dy_data = Float64[]
for line in eachline(dfile)
parts = split(line)
push!(x_data, parse(Float64, parts[1]))
push!(y_data, parse(Float64, parts[2]))
push!(dy_data, parse(Float64, parts[3]))
end
u0 = Float64[17, 0, 0.04]
data_xye = [x_data, y_data, dy_data]
optfunc_lor = OptimizationFunction(chisqr_lorentzian)
optprob_lor = OptimizationProblem(optfunc_lor, u0, data_xye)
sol = solve(optprob_lor, NLopt.LN_NELDERMEAD())
s_names = ["amplitude", "center", "sigma"]
sol_str = [n*": "*string(u) for (n, u) in zip(s_names, sol)]
push!(sol_str, "FWHM: $(2*sol[3])")
p = plot(xlabel="x", ylabel="y", title="Lorentzian fit with Optimization.jl")
scatter!(p, x_data, y_data, yerr=dy_data, label="data")
plot!(p, x_data, [lorentzian(x, sol...) for x in x_data], label="fit")
annotate!(p, (-0.3, 100, join(sol_str, "\n")), annotatiohnalign=:left)
savefig(p, "optimization_fit.pdf")
The fit results naturally agree. Of course the added benefit of lmfit
is the calculated CI of the fit parameters from scipy
´s covariance matrix computation.
For my use case, in principle the Hessian can be calculated since for most lineshapes I work with, analytical expressions for the derivative of the model w.r.t. the fit parameters are straightforward to calculate.
I understand Optimization.jl
is not a statistics package per se, but it strikes me as odd that it optimizes very well but then it does not provide confidence interval for the optimized parameters (even in the Gaussian approximation) nor any goodness-of-fit metric beyond the value of the objective function at the found minimum.
I also understand that for more complex cases pushing a “+/-” on the users’ fit results willy-nilly is dangerous. However I would contend that most use cases are not all use cases.
This exact surprise is what I felt and what prompted me to start this conversation topic in the first place. And while LsqFit
is a counter-example, it’s still quite disheartening to have to give up the whole amazing machinery of Optimization.jl
just because you want an error bar on your parameters post-fit.