Does GLM return an estimate for the error term (residual standard deviation)?

I was under the impression that calling lm(@formula(y ~ x), data) from the GLM.jl package fits a simple linear regression of the form:

y_i=a+bx_i+\epsilon_i

The results from the regression being:

\hat{y}=\hat{a}+\hat{b}x

Which is our linear predictor. But if we want to predict a value, we have to use:

\hat{y}=\hat{a}+\hat{b}x+\epsilon_i

Why doesn’t lm return an estimate for the error (\epsilon_i) in the summary of coefficients? Isn’t this an important part of the regression analysis? R returns the estimate as the auxiliary parameter. For a least squares regression, wouldn’t this estimate just be the standard deviation of the residuals? Is there any way to access this information without creating a helper function to calculate that value on my own? I’m probably just missing something very dumb!

GLM.jl doesn’t report it as part of the model summary, but you can extract it for a model with GLM.dispersion.
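As a rough sketch of what that looks like in practice, the snippet below fits a model on made-up data (the data frame, seed, and variable names are assumptions for illustration only) and computes the residual standard deviation from generic accessors. For a least squares fit this is sqrt(RSS / (n - p)), so it divides by the residual degrees of freedom rather than by n - 1 as a plain standard deviation of the residuals would.

```julia
using GLM, DataFrames, Random

# Synthetic data, purely illustrative: y = 2 + 0.5x + noise
Random.seed!(1)
x = collect(1.0:20.0)
y = 2.0 .+ 0.5 .* x .+ randn(20)
data = DataFrame(x = x, y = y)

model = lm(@formula(y ~ x), data)

# For a linear model, deviance is the residual sum of squares,
# and dof_residual is n minus the number of estimated coefficients.
sigma_hat = sqrt(deviance(model) / dof_residual(model))

# The same quantity computed directly from the residuals.
sigma_manual = sqrt(sum(abs2, residuals(model)) / dof_residual(model))

# GLM.dispersion should give the same number. Depending on the GLM.jl
# version, the fitted object may be a wrapper, in which case
# GLM.dispersion(model.model) may be needed instead (assumption: check
# against your installed version).
# sigma_disp = GLM.dispersion(model)

println(sigma_hat)
```

Computing it from deviance and dof_residual avoids relying on the non-exported function, at the cost of spelling out the formula yourself.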

If you think that this should be part of the default output, please open an issue.

Also, if you want to predict a value, see GLM.predict, which can also provide prediction intervals, taking the various uncertainties in the model into account.
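A minimal sketch of prediction intervals, again on made-up data (the data, the new x values, and the 95% level are assumptions chosen for illustration): passing interval = :prediction asks predict for bounds that account for both the uncertainty in the coefficients and the residual noise.

```julia
using GLM, DataFrames, Random

# Synthetic data, purely illustrative: y = 2 + 0.5x + noise
Random.seed!(1)
data = DataFrame(x = collect(1.0:20.0))
data.y = 2.0 .+ 0.5 .* data.x .+ randn(20)

model = lm(@formula(y ~ x), data)

# Hypothetical new observations to predict for
newdata = DataFrame(x = [21.0, 25.0])

# Returns a table with prediction, lower, and upper columns
pred = predict(model, newdata; interval = :prediction, level = 0.95)

println(pred)
```

Using interval = :confidence instead would give the narrower interval for the mean response rather than for a new observation.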


This is exactly what I was looking for! I guess I should have looked more closely at the documentation, but thank you for pointing it out anyway :slight_smile: