This is only a little bit of a Julia question and mostly a question about where to find materials for learning econometrics, but: can anyone suggest good materials for learning about modelling techniques that are commonly used with panel data? I’m currently working with some panel data using @Nosferican’s fantastic Econometrics.jl package, and I’m struggling to understand how to interpret the model output when I specify the model in different ways.
For example, I’ve tabulated some statistics for 13 different industries across 11 years from the Current Population Survey and I’m now exploring some simple models that produce very different results. For now, I’m just experimenting with different models that might explain differences in median wage rates based on the median age and the percent of the workforce that consists of racial minorities in an industry. Here’s one example:
using Econometrics
using CSV
using DataFrames
cps = DataFrame(CSV.File("data/cps_panel_10_20.csv"))
model = fit(
    EconometricModel,
    @formula(median_wage ~ median_age + prcnt_minority),
    cps
)
Continuous Response Model
Number of observations: 143
Null Loglikelihood: -366.63
Loglikelihood: -325.47
R-squared: 0.4378
LR Test: 82.32 ∼ χ²(2) ⟹ Pr > χ² = 0.0000
Formula: median_wage ~ 1 + median_age + prcnt_minority
Variance Covariance Estimator: OIM
───────────────────────────────────────────────────────────────────────────
                      PE         SE   t-value  Pr > |t|     2.50%    97.50%
───────────────────────────────────────────────────────────────────────────
(Intercept)       6.0373    2.52437    2.3916    0.0181   1.04648   11.0281
median_age      0.380779  0.0465285   8.18378    <1e-12   0.28879  0.472769
prcnt_minority  -13.9391    3.37746  -4.12709    <1e-04  -20.6165  -7.26168
───────────────────────────────────────────────────────────────────────────
Then, if I use the `absorb` option, just for the industry:
model = fit(
    EconometricModel,
    @formula(median_wage ~ median_age + prcnt_minority + absorb(prmjind1)),
    cps
)
Continuous Response Model
Number of observations: 143
Null Loglikelihood: -366.63
Loglikelihood: -178.47
R-squared: 0.9285
Wald: 85.49 ∼ F(2, 128) ⟹ Pr > F = 0.0000
Formula: median_wage ~ 1 + median_age + prcnt_minority + absorb(prmjind1)
Variance Covariance Estimator: OIM
─────────────────────────────────────────────────────────────────────────────
                      PE         SE   t-value  Pr > |t|       2.50%    97.50%
─────────────────────────────────────────────────────────────────────────────
(Intercept)     -7.10788    3.64651  -1.94923    0.0535    -14.3231  0.107359
median_age      0.175965  0.0849071   2.07244    0.0402  0.00796144  0.343968
prcnt_minority   37.7827     2.8899   13.0741    <1e-24     32.0645   43.5008
─────────────────────────────────────────────────────────────────────────────
The docs say that `absorb` is for when you “only care about the estimates of a subset of features and controls.” When I `absorb` the industry variable, does that mean that what I’m measuring is how much of the variability in median wage, within an industry, is explained by age and % minority? In other words, if I’m only interested in explaining within-industry variation (i.e. how median wage varies over time within a given industry), is that when I would use `absorb(industry)`?
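To make the "within-industry" reading concrete, here is a minimal sketch of what that interpretation would imply. It assumes that absorbing a categorical variable amounts to giving every group its own intercept (a fixed-effects specification), which is the usual meaning of the term but is an assumption about the package here. It does not call Econometrics.jl at all: it uses only Julia's standard library, the data are synthetic (made-up numbers, not the CPS panel), and names such as `demean_by_group`, `alpha`, `x`, and `y` are hypothetical.

```julia
using LinearAlgebra, Random, Statistics

Random.seed!(1)

# Synthetic panel (made-up numbers, NOT the CPS data): 13 groups x 11 periods.
n_groups, n_periods = 13, 11
group = repeat(1:n_groups, inner = n_periods)   # group id for every row
n = length(group)

alpha = 5 .* randn(n_groups)                    # hypothetical group-level effects
x = alpha[group] .+ randn(n)                    # regressor is higher in high-alpha groups
y = 1.5 .* x .- 2 .* alpha[group] .+ 0.1 .* randn(n)   # true within-group slope is 1.5

# Subtract each group's own mean from a vector (the "within" transformation).
function demean_by_group(v, g)
    out = float.(v)                  # broadcasting allocates a new array; v is untouched
    for k in unique(g)
        idx = g .== k
        out[idx] .-= mean(v[idx])
    end
    return out
end

# (1) Pooled OLS with a single intercept: mixes between- and within-group variation.
slope_pooled = (hcat(ones(n), x) \ y)[2]

# (2) OLS with one dummy column per group (no common intercept).
D = Float64.(group .== (1:n_groups)')           # n x n_groups indicator matrix
slope_dummies = (hcat(x, D) \ y)[1]

# (3) OLS on group-demeaned data.
xw, yw = demean_by_group(x, group), demean_by_group(y, group)
slope_within = dot(xw, yw) / dot(xw, xw)

println("pooled:  ", slope_pooled)
println("dummies: ", slope_dummies)
println("within:  ", slope_within)
```

By the Frisch-Waugh-Lovell theorem, (2) and (3) give the same slope up to floating-point error, so a per-group-intercept model uses only variation around each group's own mean. The pooled slope in (1) can differ a lot, even in sign, when level differences between groups are correlated with the regressor. If `absorb(prmjind1)` does behave like (2) and (3), that could be one reason the `prcnt_minority` coefficient changes sign between the two fits above.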