Econometrics.jl vs. R's plm

I’m working through the Econometrics With R online book, specifically the example in 10.3 on fixed effects regression, but I’m trying to follow along with Econometrics.jl.

Here’s the full code to reproduce the example:

using DataFramesMeta
using Econometrics
using RCall

df = rcopy(R"

df = @chain @rget(Fatalities) begin
    @transform(fatal_rate = :fatal ./ :pop .* 10_000)

model = fit(
    @formula(fatal_rate ~ beertax + absorb(state)),

# Outputs:

Continuous Response Model
Number of observations: 336
Null Loglikelihood: -287.50
Loglikelihood: 105.99
R-squared: 0.9050
Wald: 12.19 ∼ F(1, 287) ⟹ Pr > F = 0.0006
Formula: fatal_rate ~ 1 + beertax + absorb(state)
Variance Covariance Estimator: OIM
                 PE         SE      t-value  Pr > |t|     2.50%     97.50%
(Intercept)   2.37707   0.0969699  24.5135     <1e-71   2.18621   2.56794
beertax      -0.655874  0.18785    -3.49148    0.0006  -1.02561  -0.286135

The book shows the output from R’s plm package and there are some differences:

#>         Estimate Std. Error t value Pr(>|t|)  
#> beertax -0.65587    0.28880  -2.271  0.02388 *

Note that:

  1. There is no intercept
  2. The beertax coefficient is the same but the standard error and subsequent t and p values are different

Can anyone explain the difference? FixedEffectModels.jl produces an output that is much closer to the plm output:

using FixedEffectModels

reg(df, @formula(fatal_rate ~ beertax + fe(state)), Vcov.cluster(:state))

                          Fixed Effect Model
Number of obs:                 336   Degrees of freedom:              2
R2:                          0.905   R2 Adjusted:                 0.905
F-Stat:                    5.05015   p-value:                     0.029
R2 within:                   0.041   Iterations:                      1
fatal_rate |  Estimate Std.Error  t value Pr(>|t|) Lower 95%  Upper 95%
beertax    | -0.655874  0.291856 -2.24725    0.025  -1.22998 -0.0817668

First, in terms of the intercept. There is no consistency between packages on what exactly the intercept means in fixed effect models. Since using fixed effects is equivalent to adding an indicator variable for each category, the most common method is to somehow average across these different variables. Therefore, depending on who you ask, the intercept does or does not have meaning.

For the difference in standard errors, it looks like there is a difference in the assumptions made. Economectrics.jl seems to be just using normal standard errors, these are typically the smallest. plm, at least according to the book chapter you cited, is using robust standard errors, which allow for heteroskedasticity in the model (basically, the variance in the error term is not consistent). FixedEffectModels is using clustered standard errors, which allow for errors to be correlated across observations.

Try rerunning the FixedEffectModels as:

reg(df, @formula(fatal_rate ~ beertax + fe(state)), Vcov.robust())

Which should produce the same result.


Just adding a couple notes to @junder873 answer. The intercept when using the within estimator can be arbitrary and without a consistent interpretation depending on how it is computed. Econometrics.jl approach follows the principal that when possible, the intercept reported should match the one you would get if you had run the dummy variables model. In that sense, the intercept keeps the same value and interpretation it would have regardless of whether you estimate the model with all the features or absorb them.

In terms of the variance covariance estimators, it is now known that using cluster-robust vce is a bad default since it really depends on the assumed sampling design and whether we believe the treatment assignment is correlated with the sampling. In other words, just because we have “groups”, it doesn’t mean we have clusters (i.e., groups with different marginal effects) and that the sampling design causes clusters to have non-representative representations. If we do believe that is the case, you can choose a cluster robust variance covariance estimator that’s more appropriate. Looking at the current version, I believe the default is the observed information matrix (assumes White noise). Then you have your heteroscedasticity consistent options HC0, HC1, HC2, HC3, HC4. I can’t recall why I removed the cluster robust one… At some point, I think maybe I opted to incorporate GitHub - gragusa/CovarianceMatrices.jl: Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation for Julia. and never did for Arrellano/Rogers CRVE… I probably should have added those back as the extended HC1 with some residual degree adjustment heuristics.

1 Like