Basic, technical questions on “goodness-of-fit” for regression models…
Using N data points, suppose I fit a regression model with n_\beta parameters \beta to get a predictor \hat{y}_i = \hat{\beta}\phi_i, with prediction error e_i = y_i - \hat{y}_i. The standard deviation \sigma_e of the prediction error is then given by

$$\sigma_e = \sqrt{\frac{1}{N}\sum_{i=1}^{N} e_i^2},$$

and a corrected standard deviation s_e is given by

$$s_e = \sqrt{\frac{1}{N - n_\beta}\sum_{i=1}^{N} e_i^2},$$
where essentially the subtraction of n_\beta corrects for the fact that the parameters \hat{\beta} have been estimated from the same data used to compute the standard deviation. [Trivial example: \hat{y}_i = \bar{y}, where n_\beta = 1.]
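For concreteness, here is a minimal Julia sketch of the two estimates for an ordinary least-squares fit (the data and variable names are made up for illustration):

```julia
# Minimal sketch of the two estimates for an ordinary least-squares fit;
# all data here are synthetic, purely for illustration.
N, n_beta = 50, 3
Phi = [ones(N) randn(N, n_beta - 1)]          # design matrix (intercept + 2 regressors)
y = Phi * [1.0, 2.0, -0.5] + 0.1 * randn(N)   # synthetic responses

beta_hat = Phi \ y                  # least-squares estimate of the parameters
e = y - Phi * beta_hat              # prediction errors e_i

sigma_e = sqrt(sum(abs2, e) / N)              # divide by N
s_e     = sqrt(sum(abs2, e) / (N - n_beta))   # divide by N - n_beta
```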
Now suppose instead that the n_\beta parameters have been computed from training data, while I want to compute the standard deviation over validation data which are different from the training data. I’d like to use this standard deviation as a measure of “goodness-of-fit”. (A sketch of this setting follows the questions below.)

Two questions:

- Since I didn’t use the validation data to compute \hat{\beta}, would it be correct to compute the standard deviation over the validation data using the expression for \sigma_e? Or should I use the corrected expression, i.e., s_e?
- Is the correction of dividing by N - n_\beta based on an assumption of a linear regression model, or would the same idea be valid for nonlinear regression methods such as ANNs?
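Here is a sketch of the train/validation setting I have in mind (again with made-up data), computing both candidate measures on the validation data:

```julia
# Sketch of the train/validation setting; the question is which of the
# last two lines is the right "goodness-of-fit" measure here.
using Random
Random.seed!(1)

N_train, N_val, n_beta = 40, 20, 3
Phi_train = [ones(N_train) randn(N_train, n_beta - 1)]
Phi_val   = [ones(N_val)   randn(N_val,   n_beta - 1)]
beta_true = [1.0, 2.0, -0.5]
y_train = Phi_train * beta_true + 0.1 * randn(N_train)
y_val   = Phi_val   * beta_true + 0.1 * randn(N_val)

beta_hat = Phi_train \ y_train       # parameters estimated from training data only
e_val = y_val - Phi_val * beta_hat   # prediction errors on the validation data

sigma_e_val = sqrt(sum(abs2, e_val) / N_val)             # uncorrected
s_e_val     = sqrt(sum(abs2, e_val) / (N_val - n_beta))  # corrected
```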
Sorry for bothering you with such trivial questions – I’m trying to convince some colleagues to take a look at Julia, and plan to use basic regression as a case study for them. (I’d like to understand what is going on in packages before I use them…)