Extract model coefficients with FixedEffectModels.jl package

Antonina_Klyuyeva · October 28, 2020, 1:07pm

Hi everyone!
In my regression model (Y ~ A:B), a numeric variable (A) interacts with a categorical variable (B). Since the categorical variable has a lot of unique levels, fitting the model with the GLM.jl package consumes a lot of RAM. I switched to the FixedEffectModels.jl package and it performs much better! However, I am having difficulty extracting the model coefficients (effect names, estimates, stderr, p_value, t_value, etc.), and also with residuals and predict. I read the documentation here and here, but I didn’t see the functions I needed.
When building the model, I selected the option save = true, which saves the residuals and fixed effects estimates. When I extract fe(model), I get a table like this:

julia> fe(model)
444378×1 DataFrame
│ Row    │ fe_B&A                 │
│        │ Float64                │
├────────┼────────────────────────┤
│ 1      │ -1.47252               │
│ 2      │ -1.47252               │
⋮
│ 444376 │ -0.610356              │
│ 444377 │ -0.610356              │
│ 444378 │ -0.610356              │

How can I also get the names of the effects, so that I can match them with the fixed effects estimates? (At the moment I see only row numbers.)
Also, how can I get the other statistics (stderr, p_value, t_value, etc.)?

Thanks a lot for your help!

pdeffebach · October 28, 2020, 1:17pm

All of the functions here should be implemented: Abstraction for Statistical Models · StatsBase.jl

Albert_Zevelev · October 28, 2020, 2:10pm

I also wasn’t able to find the FE names.
I don’t think it’s currently possible.

@matthieu?

PS: When you have high-dimensional FEs, they become hard to interpret anyway, so users rarely try.

Antonina_Klyuyeva · October 28, 2020, 2:11pm

Thanks for the advice!
I have already tried most of these functions, but did not get acceptable results.
For example:

coefnames()

julia> coefnames(model)
1-element Array{String,1}:
 "(Intercept)"

but I need names of fixed effects too (not Intercept only).

coef()

julia> coef(model)
1-element Array{Float64,1}:
 2.821248194859784

does not return coefficients for fixed effects.

stderror()

julia> stderror(model)
1-element Array{Float64,1}:
 0.027292973948148374

returns result for intercept only (fixed effects needed as well).

residuals()

julia> residuals(model)
444378-element Array{Union{Missing, Float64},1}:
 -0.4949565332014614
 -0.38919178650364045
 -0.4186431448593252
 -0.2590869635253943
 -0.17627291925168315
  ⋮
 -0.34583852183630376
 -1.5652955707355511
 -1.1542833700855635
 -2.1144655963560104

How can I match these values with coefficient names?

predict()

julia> predict(fit)
ERROR: predict is not defined for FixedEffectModel.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] predict(::FixedEffectModel) at /home/antonina_kliuieva/.julia/packages/StatsBase/EA8Mh/src/statmodels.jl:368
 [3] top-level scope at REPL[19]:1

raises an error,
etc.

So, my question is still open…
Sorry, I’m new to Julia, maybe I’m missing something

pdeffebach · October 28, 2020, 2:19pm

No, you are not missing anything unfortunately. There is an issue here filed last week to make prediction easier with fixed effects.
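Until predict is supported, in-sample fitted values can still be recovered from the saved residuals, since a fitted value is just the observed outcome minus its residual. A minimal sketch, assuming `y` stands in for the outcome column and `res` for the vector returned by residuals(model) (both are made-up values here, and this does not help with predicting on new data):

```julia
# Hypothetical stand-ins: y for the outcome column, res for residuals(model).
# Rows that were dropped from estimation carry `missing` in the residuals.
y   = [1.0, 2.0, missing, 4.0]
res = [0.1, -0.2, missing, 0.3]

# Fitted value = observed outcome - residual; `missing` propagates automatically.
fitted = y .- res

println(fitted)
```

The result has one entry per input row, in the same order, so it can be attached to the original data frame as a new column.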

With regards to the values of the fixed effects, I think that the only solution is to

1. hcat the vector of intercepts with your data
2. Keep just the columns of interest, i.e. B and A
3. Call unique on the data frame
4. Work with the results of that to match combinations of :B and :A to intercepts
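
These steps can be sketched with DataFrames.jl as follows. This is only an illustration under stated assumptions: `df` and `fe_df` are small made-up tables, with `fe_df` standing in for the output of fe(model), and the sketch assumes the saved column holds one value per level of B, so only B is kept next to the estimate.

```julia
using DataFrames

# Hypothetical input data: B is the categorical variable, A the numeric one.
df = DataFrame(B = ["a", "a", "b", "b", "c"], A = [1.0, 2.0, 0.5, 1.5, 3.0])

# Hypothetical stand-in for fe(model): one row per observation, same order as df.
fe_df = DataFrame("fe_B&A" => [-1.47, -1.47, -0.61, -0.61, 0.25])

# Step 1: attach the saved estimates to the data, row by row.
combined = hcat(df, fe_df)

# Steps 2-3: keep the identifying column plus the estimate, then drop duplicate rows.
per_level = unique(combined[:, ["B", "fe_B&A"]])

# Step 4: build a lookup from level name to estimate.
lookup = Dict(zip(per_level.B, per_level[!, "fe_B&A"]))

println(per_level)
```

If the estimate turns out to vary within a level of B, `per_level` will contain more than one row for that level, which is a quick check on the assumption above.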

It would be nice if obtaining the value of the intercepts were as easy as in reghdfe in Stata or fixest in R. Hopefully we can build more UX polish into the package in the future.

Antonina_Klyuyeva · October 28, 2020, 3:03pm

Ummm, the package created to evaluate high-dimensional fixed effect variables doesn’t provide an easy way to extract estimates for these fixed effects?
Do you happen to know whether the rows of the fe(model) DataFrame are in the same order as the original data set? Can I expect, for example, that the estimate -1.47252 (first row of the data frame below) corresponds to the effect in the first row of the input data?
That is, can I join the fe data frame with the input data by row?

julia> fe(model)
444378×1 DataFrame
│ Row    │ fe_B&A                 │
│        │ Float64                │
├────────┼────────────────────────┤
│ 1      │ -1.47252               │
│ 2      │ -1.47252               │
⋮
│ 444376 │ -0.610356              │
│ 444377 │ -0.610356              │
│ 444378 │ -0.610356              │

Thank you!

danicaratelli · October 28, 2020, 3:05pm

There should be a “save=true” option. Add that at the end of the regression function and then you should be able to do everything. Something like this:

reg_res = reg(df, @formula(X ~ Y + Z + fe(K)), save = true);
residuals(reg_res)

Antonina_Klyuyeva · October 28, 2020, 3:13pm

I have the save=true option enabled, but most functions don’t give the same result as GLM (please see my reply to @pdeffebach above).

For example, I get an array like this:

julia> residuals(model)
444378-element Array{Union{Missing, Float64},1}:
 -0.4949565332014614
 -0.38919178650364045
 -0.4186431448593252
 -0.2590869635253943
 -0.17627291925168315
  ⋮
 -0.34583852183630376
 -1.5652955707355511
 -1.1542833700855635
 -2.1144655963560104

How can I match each value with the name of the effect for which it is calculated?

Albert_Zevelev · October 28, 2020, 3:22pm

Yes.
Stata's xtreg reports estimates and statistics for the FEs.
Stata's reghdfe does not report them for “absorbed” FEs.

pdeffebach · October 28, 2020, 3:32pm

Yes, in general economists don’t care about the values of the fixed effects. We just want to use “within-unit variation”, hence the lack of emphasis on analyzing these fixed effects.

And yes, the rows of fe(model) can be matched to the input data by row: just use hcat(df, fe(model)). That will work.

Antonina_Klyuyeva · October 28, 2020, 3:43pm

Thanks a lot @pdeffebach, I’ll try this.
If I understand correctly, at the moment there is no way to obtain stderr, t_value, and p_value for A:fe(B)?

pdeffebach · October 28, 2020, 3:52pm

No. My understanding is that this is why it converges so fast. By not calculating (or even materializing) the FE, you can estimate things faster and with less memory.
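
The idea of estimating without materializing the fixed effects can be illustrated with the within transformation. The sketch below is not FixedEffectModels.jl's actual algorithm; it uses simulated data (all names and values are made up) and the simpler case of group intercepts rather than group-specific slopes. It compares the slope from group-demeaned data with the slope from a regression that includes one dummy column per group.

```julia
using Random

Random.seed!(1)
n_groups, n_per = 50, 20
g = repeat(1:n_groups, inner = n_per)        # group id for each row
alpha = randn(n_groups)                      # true group intercepts
x = randn(length(g)) .+ alpha[g]             # regressor correlated with the groups
y = 2.0 .* x .+ alpha[g] .+ 0.1 .* randn(length(g))

# Within transformation: subtract each group's mean from a vector.
function demean(v, g)
    sums = zeros(maximum(g))
    counts = zeros(Int, maximum(g))
    for i in eachindex(v)
        sums[g[i]] += v[i]
        counts[g[i]] += 1
    end
    return v .- (sums ./ counts)[g]
end

# Slope from demeaned data: no group coefficients are ever estimated.
xd, yd = demean(x, g), demean(y, g)
beta_within = sum(xd .* yd) / sum(xd .^ 2)

# Same slope from an explicit dummy-variable regression (one column per group).
D = zeros(length(g), n_groups)
for i in eachindex(g)
    D[i, g[i]] = 1.0
end
beta_dummy = ([x D] \ y)[1]

println((beta_within, beta_dummy))
```

The two slopes agree up to floating-point error, but the dummy-variable version needs a matrix with one column per group, which is where the memory goes when B has many levels. The cost of the shortcut is that standard errors for the individual group effects are not produced along the way.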

Antonina_Klyuyeva · October 28, 2020, 5:03pm

Ok, thank you!

Antonina_Klyuyeva · October 28, 2020, 5:06pm

One more question (maybe someone knows): will this issue with the predict function be addressed in the near future? I would just like to understand whether it is worth waiting or whether it is better to focus on another package.
Thanks!

nilshg · October 28, 2020, 6:26pm

You could extend my hack in the issue to take account of the interactions in your model. (If you tried it, you might have noticed that what I put in the issue comment breaks down when there are interactions in the model.)

Overall it might be helpful to understand a bit better what you are trying to achieve. As Peter says, fixed effects models are heavily used by economists (and the author of FixedEffectModels is an economist as well), who mostly use them to get closer to causal identification of marginal effects by controlling for unobserved, time-invariant heterogeneity. Because of that, they don’t actually care about the fixed effects themselves. It appears that your use case is different, so a different approach might be warranted.
