[ANN] RobustModels: Robust linear regression using M-Estimators and more

I would like to present RobustModels.jl, a new package
for robust linear regression using M-estimators, MM-estimators and τ-estimators, as well as quantile regression, M-quantile regression and robust ridge regression.
It can also be used to compute robust estimates of the mean and variance of a vector,
and to detect outliers, similar to LinRegOutliers.jl.

It was originally based on RobustLeastSquares.jl, and it uses the same interface as GLM.jl so that existing code is easier to reuse.

You can find the documentation here: Home · RobustModels
The package is now registered, so install it with:

] add RobustModels

(Before it was registered, it had to be installed from the repository URL: ] add https://github.com/getzze/RobustModels.jl)


Thank you very much for this package. I have one question up front, before I start investigating it.
The standard deviation reflects not only instability in the form of fluctuations; it also reflects the slope (as captured by a linear regression).
I would like to decouple the two: a) oscillation / fluctuation, b) drift / slope.
I have in mind a new variant of the standard deviation: in the first step I calculate the slope by means
of linear regression, in the second step I remove the slope from the investigated points (keeping the
arithmetic mean constant), and in the third step I calculate the slope-corrected standard deviation.
Is this a reasonable approach? How would it be calculated with your module?

If you don’t have outliers in your data (or don’t care about them), you don’t need a robust linear regression; GLM.jl (GitHub - JuliaStats/GLM.jl: Generalized linear models in Julia) is enough, and in fact just lm will do.
Linear regression does exactly what you are describing: the intercept and slope are the two coefficients fitted by the model. What you call the “slope corrected standard deviation” is basically the standard deviation of the residuals (real_data - prediction). If the model is valid, the standard deviation of the residuals should be significantly smaller than the standard deviation of the unfitted data.
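
To make the three steps concrete, here is a minimal sketch that uses only Julia's standard library, with an ordinary least-squares fit written out by hand instead of calling lm. The data (x, y) and all variable names are made up for illustration; they do not come from the package or from this thread.

```julia
using Statistics

# Hypothetical example data: a linear drift plus a small oscillation.
x = collect(1.0:10.0)
y = 2.0 .+ 0.5 .* x .+ 0.1 .* sin.(x)

# Step 1: ordinary least-squares fit of y ≈ intercept + slope * x.
X = [ones(length(x)) x]        # design matrix with an intercept column
intercept, slope = X \ y       # backslash solves the least-squares problem

# Step 2: remove the fitted line. The residuals have (numerically) zero mean;
# adding mean(y) back would keep the original mean without changing the spread.
resid = y .- (intercept .+ slope .* x)

# Step 3: the "slope corrected" standard deviation is the std of the residuals.
sd_raw = std(y)
sd_detrended = std(resid)

println("std of raw data:  ", sd_raw)
println("std of residuals: ", sd_detrended)
```

Note that std divides by n - 1, whereas the residual standard error usually reported for a regression divides by n - 2 (one degree of freedom per fitted coefficient), so the two differ slightly for short series. With GLM.jl, the same residuals should be obtainable from the fitted lm model rather than computed by hand.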
