The work in progress is the following:
- A separate Regression / Econometrics environment that builds on IO (CSV/Feather) + DataFrames + StatsBase + StatsModels + GLM
- It will include a suite of packages that provide additional functionality:
- A utility package for various transformations and helpers: generalized within transformation (absorbs fixed effects and handles singletons), first-difference transformation, between estimator, two-stage estimators, selection of a linearly independent subset of predictors, etc.
- An intermediate package for computing the distances and kernel auto-tuners for correlation structures, which will then be used to provide sandwich estimators (multi-way clustering, SHAC, HAC, HC, etc.)
- The covariance matrices package for sandwich estimators and bootstrapping
- Regression hypothesis tests and diagnostics: StatsBase will host the Wald, likelihood-ratio (LR), and score tests. Hypothesis testing for specific cases will then construct the appropriate test (Wald test, robust Hausman, etc.)
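
To make the within transformation concrete, here is a minimal sketch of the simplest case: a single fixed effect, with singleton groups dropped before demeaning. It is written in plain Python only so it runs without dependencies; the function name `within_transform` and its return shape are assumptions for illustration, not the API of the planned utility package, and the generalized version (several fixed effects absorbed at once) is not covered.

```python
from collections import defaultdict


def within_transform(values, groups):
    """Demean `values` within each group, dropping singleton groups.

    Hypothetical helper: returns (kept_indices, demeaned_values).
    """
    # Collect the row indices that belong to each group.
    rows_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        rows_by_group[g].append(i)

    # A group with one row has no within-group variation, so drop its row.
    kept = [i for i, g in enumerate(groups) if len(rows_by_group[g]) > 1]

    # Mean of each group over its own rows.
    means = {
        g: sum(values[i] for i in rows) / len(rows)
        for g, rows in rows_by_group.items()
    }

    # Subtract the group mean from every retained observation.
    demeaned = [values[i] - means[groups[i]] for i in kept]
    return kept, demeaned


# Toy data: group "b" is a singleton and is removed.
y = [1.0, 3.0, 10.0, 4.0, 8.0]
firm = ["a", "a", "b", "c", "c"]
kept, y_within = within_transform(y, firm)
```

The same transformation would be applied to the response and to every predictor before fitting, so that the fixed effect drops out of the regression.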
DataFrames / StatsBase / StatsModels / GLM have been updated, so it is now a matter of unifying the various packages and finishing the implementation of the missing features.
A few comments: in the future, DataFrameRegressionModel will probably be deprecated in favor of inheritance from StatsBase.RegressionModel. Covariance matrices will then be able to work with any RegressionModel, rather than only GLM.GeneralizedLinearModel, with minimal effort.
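
The design idea behind that last point can be sketched as follows: if a covariance estimator only asks a model for its model matrix and residuals, it works for any model type that provides them. The example below is a rough illustration in plain Python, using the HC0 (White) sandwich estimator; `ToyOLS`, `model_matrix`, `residuals`, and `hc0` are hypothetical names chosen for this sketch and are not the interface of StatsBase or of the planned covariance package.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


class ToyOLS:
    """Hypothetical model type: least squares fitted via the normal equations."""

    def __init__(self, X, y):
        self.X = X
        Xt = transpose(X)
        beta = matmul(inverse(matmul(Xt, X)), matmul(Xt, [[v] for v in y]))
        self.coef = [b[0] for b in beta]
        self.resid = [
            yi - sum(x * b for x, b in zip(row, self.coef))
            for row, yi in zip(X, y)
        ]

    def model_matrix(self):
        return self.X

    def residuals(self):
        return self.resid


def hc0(model):
    """HC0 sandwich estimator; uses only model_matrix() and residuals()."""
    X = model.model_matrix()
    u = model.residuals()
    bread = inverse(matmul(transpose(X), X))      # (X'X)^-1
    Xu = [[x * ui for x in row] for row, ui in zip(X, u)]
    meat = matmul(transpose(Xu), Xu)              # sum_i u_i^2 x_i x_i'
    return matmul(matmul(bread, meat), bread)


# Intercept plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 2.0, 5.0]
model = ToyOLS(X, y)
V = hc0(model)
```

Because `hc0` never inspects the concrete class, any other model object exposing the same two methods could be passed in unchanged, which is the kind of reuse that inheriting from a common regression model type is meant to enable.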