Hi everyone!
In my regression model (Y ~ A:B), a numeric variable (A) interacts with a categorical variable (B). Since the categorical variable has a lot of unique levels, fitting the model with the GLM.jl package consumes a lot of RAM. I tried the FixedEffectModels.jl package and it looks much better! However, I have difficulty extracting the model output, such as the names of the effects, estimates, stderr, p_value, t_value, etc., and also with residuals and predict. I read the documentation here and here, but I didn't see the functions I needed.
When building the model, I selected the option save = true, which saves the residuals and fixed effects estimates. When I extract fe(model), I get a table like this:
How can I also get the names of the effects to match them with the fixed effects estimates? (I see only row numbers.)
Also, how can I get the other statistics (stderr, p_value, t_value, etc.)?
How can I match values with coefficient names?
predict()
julia> predict(fit)
ERROR: predict is not defined for FixedEffectModel.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] predict(::FixedEffectModel) at /home/antonina_kliuieva/.julia/packages/StatsBase/EA8Mh/src/statmodels.jl:368
[3] top-level scope at REPL[19]:1
gives an error
etc.
So, my question is still open…
Sorry, I'm new to Julia, maybe I'm missing something.
No, you are not missing anything unfortunately. There is an issue here filed last week to make prediction easier with fixed effects.
With regard to the values of the fixed effects, I think that the only solution is to:
hcat the vector of intercepts with your data
Keep just the columns of interest, i.e. B and A
Call unique on the data frame
Work with the results of that to match combinations of :B and :A to intercepts
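The steps above can be sketched roughly as follows. This is only an illustration using DataFrames.jl: `fe_est` is a made-up stand-in for the table returned by fe(model), the column name `fe_B` and all numbers are hypothetical, and it assumes the rows of that table are in the same order as the rows of the input data.

```julia
using DataFrames

# Toy input data: B is the categorical variable, A the numeric one.
df = DataFrame(B = ["x", "y", "x", "z", "y"], A = [1.0, 2.0, 3.0, 4.0, 5.0])

# Hypothetical stand-in for fe(model): one estimate per row of df.
# The column name and the values are invented for illustration.
fe_est = DataFrame(fe_B = [-1.5, 0.3, -1.5, 2.1, 0.3])

# Step 1: glue the estimates onto the data (assumes identical row order).
combined = hcat(df, fe_est)

# Steps 2-3: keep the level column and the estimate, then drop duplicates.
lookup = unique(select(combined, :B, :fe_B))

# Step 4: a level => estimate mapping for matching by name.
by_level = Dict(lookup.B .=> lookup.fe_B)
```

Here `lookup` has one row per level of B, so it can be joined back to other tables or inspected directly.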
It would be nice if obtaining the value of the intercepts were as easy as in reghdfe in Stata or fixest in R. Hopefully we can build more UX polish into the package in the future.
Ummm, the package created to evaluate high-dimensional fixed effect variables doesn't provide an easy way to extract estimates for these fixed effects?
Maybe you know: are the rows of the fe(model) DataFrame in the same order as the original data set? Can I expect, for example, that the estimate -1.47252 (first row of the dataframe below) corresponds to the effect in the first row of the input data? That is, can I join the fe dataframe with the input data by row?
There should be a "save=true" option. Add that at the end of the regression function and then you should be able to do everything. Something like this:
reg_res = reg(df, @formula(X ~ Y + Z + fe(K) ),save=true);
residuals(reg_res)
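For the non-absorbed regressors, the generic StatsBase-style accessors may be enough to get names, estimates, standard errors and t/p values. A minimal sketch on made-up data, assuming your installed version of FixedEffectModels implements and re-exports `coefnames`, `coef`, `stderror` and `coeftable` for the fitted model (the variable names `y`, `x`, `g` are hypothetical):

```julia
using DataFrames, FixedEffectModels

# Made-up data: 4 groups of 5 observations each.
g = repeat(1:4, inner = 5)
x = Float64.(1:20)
y = 2 .* x .+ 10 .* g .+ sin.(1:20)   # deterministic "noise" so the fit is not exact
df = DataFrame(y = y, x = x, g = g)

# g is absorbed as a fixed effect; x is an ordinary regressor.
m = reg(df, @formula(y ~ x + fe(g)), save = true)

coef_names = coefnames(m)      # names of the non-absorbed regressors
estimates  = coef(m)           # point estimates, same order as coef_names
std_errors = stderror(m)       # standard errors
t_values   = estimates ./ std_errors
table      = coeftable(m)      # summary table, including p-values
```

Note that this only covers the coefficients reported in the regression table; the absorbed fixed effects still have to be recovered via fe(model) as described above.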
Yes, in general economists don't care about the values of the fixed effects. We just want to use "within-unit variation", hence the lack of emphasis on analyzing these fixed effects.
Yes, just use hcat(df, fe(model)). That will work.
No. My understanding is that this is why it converges so fast. By not calculating (or even materializing) the FE, you can estimate things faster and with less memory.
One more question (maybe someone knows): will this issue (with the predict function) be addressed in the near future? I would just like to understand whether it is worth waiting or whether it is better to focus on another package.
Thanks!
You could extend my hack in the issue to take account of the interactions in your model (if you tried it, you might have noticed that what I put in the issue comment breaks down when there are interactions in the model).
Overall it might be helpful to understand a bit better what you are trying to achieve. As Peter says, fixed effects models are heavily used by economists (and the author of FixedEffectModels is an economist as well), who mostly use them to get closer to causal identification of marginal effects (by controlling for unobserved, time-invariant heterogeneity), and because of that don't actually care about the fixed effects themselves. It appears that your use case is different, so a different approach might be warranted.