How do I get lower and upper bounds for a confidence band using LsqFit.jl?

I’m fitting a three-parameter non-linear function using LsqFit.jl. I know I can get confidence intervals for the parameters using confidence_interval(fit), but how do I actually get the lower and upper bounds of a confidence band? I believe GLM.jl has lower and upper properties on its predictions. Does LsqFit have something similar?

confidence_interval(fit) provides the confidence interval for the parameters. Are you trying to obtain the confidence interval on the predictions instead? If so, I didn’t see a predict function in the package.

Basically, I’m trying to understand how to draw a confidence band around the fitted function on a plot. I searched online and didn’t find much info, apart from other programs or languages that do this automatically, and an example using GLM.

It appears that there is a pull request for this feature, but I am not sure why it has not been merged.

Assuming your residuals are normally distributed and homoscedastic, a naive approach could involve computing the standard deviation of the residuals:

using Statistics, Distributions

σ = std(fit.resid)  # standard deviation of the residuals
error = quantile.(Normal(0, σ), [0.025, 0.975])  # 95% band limits

I think you can pass error to the yerror keyword in Plots, but I don’t know whether that would be the best approach.
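For what it’s worth, a minimal plotting sketch under those assumptions (xdata, ydata, and the two-argument model function are placeholder names from a typical curve_fit setup; the band here has constant width):

using Plots, Distributions, Statistics, LsqFit

halfwidth = quantile(Normal(0, std(fit.resid)), 0.975)  # 95% half-width

# either as error bars on the data points ...
scatter(xdata, ydata; yerror = halfwidth, label = "data")
# ... or as a constant shaded band around the fitted curve
plot!(xdata, model(xdata, coef(fit)); ribbon = halfwidth, label = "fit")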

I believe I’ve found the answer. That pull request links to this answer, which explains how to get confidence bands for each data point using the Jacobian and the covariance matrix (both readily available in LsqFit.jl). I’ve written a function that calculates this:

using LsqFit, Distributions

function confidence_bands(fit::LsqFit.LsqFitResult, alpha = 0.05; dist::Symbol = :t)
    if !(dist in (:t, :normal))
        throw(ArgumentError("'dist' argument must be either :t or :normal, got :$dist"))
    end

    J = fit.jacobian
    cov = LsqFit.estimate_covar(fit)  # (non-normalized) parameter covariance matrix
    # delta method: std dev of the fitted curve at each data point
    σ = sqrt.([g' * cov * g for g in eachrow(J)])
    z = if dist == :t
        Distributions.quantile(Distributions.TDist(LsqFit.dof(fit)), 1 - alpha / 2)
    else
        Distributions.quantile(Distributions.Normal(), 1 - alpha / 2)
    end

    z .* σ  # half-width of the band at each data point
end

Note that I’ve avoided using the normalized covariance matrix, using the non-normalized one instead, per this answer.
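For completeness, a hypothetical usage sketch (model, xdata, ydata, and p0 stand in for your own three-parameter model, data, and initial guess):

using LsqFit, Plots

fit = curve_fit(model, xdata, ydata, p0)
band = confidence_bands(fit)  # 95% half-widths, one per data point

ŷ = model(xdata, coef(fit))
plot(xdata, ŷ; ribbon = band, fillalpha = 0.3, label = "fit with 95% band")
scatter!(xdata, ydata; label = "data")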

2 Likes

Depending on whether you want confidence intervals for the conditional mean or prediction intervals for a single response, conditional on the regressors, you need a different formula. See the third answer here (with 79 votes): regression - Difference between confidence intervals and prediction intervals - Cross Validated

It’s just a matter of adding an estimated error variance σ̂² to the estimated variance of the fit, or not.
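For example, a sketch of that adjustment on top of the confidence_bands function above (assuming, as is common, that σ̂² is estimated by the mean squared residual):

using LsqFit, Distributions

# Hypothetical prediction-band variant: widens the confidence band by the
# estimated error variance of a single new observation.
function prediction_bands(fit::LsqFit.LsqFitResult, alpha = 0.05)
    J = fit.jacobian
    cov = LsqFit.estimate_covar(fit)
    s2 = sum(abs2, fit.resid) / LsqFit.dof(fit)         # σ̂²: estimated error variance
    σ = sqrt.([g' * cov * g + s2 for g in eachrow(J)])  # add σ̂² to the variance of the fit
    z = Distributions.quantile(Distributions.TDist(LsqFit.dof(fit)), 1 - alpha / 2)
    z .* σ
end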

2 Likes

By the time you’re doing least squares on nonlinear fits and asking for prediction intervals, you might as well go straight to Turing.jl, create a proper Bayesian model, and sample it. Least squares is essentially maximum likelihood with a Gaussian likelihood. Add some priors for your parameters, return some generated quantities for predicted data points, and you’re all set.
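A minimal sketch of what that could look like (the exponential-decay model and all variable names are made up for illustration):

using Turing, LinearAlgebra

# Hypothetical three-parameter nonlinear model: y = a * exp(-b * x) + c
@model function decay_model(x, y)
    a ~ Normal(0, 10)
    b ~ truncated(Normal(0, 1); lower = 0)
    c ~ Normal(0, 10)
    σ ~ truncated(Normal(0, 1); lower = 0)  # observation noise
    μ = a .* exp.(-b .* x) .+ c
    y ~ MvNormal(μ, σ^2 * I)
    return μ  # generated quantity: the mean curve
end

chain = sample(decay_model(xdata, ydata), NUTS(), 1_000)
# Posterior predictive draws for new responses:
preds = predict(decay_model(xdata, missing), chain)

Quantiles of the preds draws at each x then give you a prediction band directly.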

1 Like

How fast should I expect something like this to be? I always get the impression that without a bag full of optimization tricks I’m looking at minutes to hours of sampling vs. < seconds for frequentist inference.

Depends on how complicated the model is. But for common stuff you’d do least squares on, it can be a few seconds or a minute.

Generally the big question is how many parameters you have. With 5-10 you might do it in seconds, with 100-500 you may be talking minutes or tens of minutes, and with 1000-10000 you might be talking hours to days.

Also note that “< seconds for frequentist inference” is usually only true when you’re talking about linear models where the solution can be found with matrix algebra. When you’re talking about nonlinear models you’re using an optimization algorithm, and the amount of time depends on the optimization method and whether it converges, so the problems are more similar to sampling.

3 Likes