[ANN] RobustModels: Robust linear regression using M-Estimators and more

Hi,
I would like to present RobustModels.jl, a new package
for robust linear regression using M-estimators, MM-estimators and τ-estimators, as well as quantile regression, M-quantile regression and robust ridge regression.
It can also be used to compute robust estimates of the mean and variance of a vector,
and to detect outliers, similar to LinRegOutliers.jl.

It was originally based on RobustLeastSquares.jl and uses the same interface as GLM.jl for better reusability.
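Here is a minimal sketch of the interface with made-up toy data (MEstimator and TukeyLoss are estimator/loss names as given in the documentation; check the docs for the full list and options):

using StatsModels: @formula
using DataFrames
using RobustModels

# Toy data with one obvious outlier at x = 8
df = DataFrame(x = 1:10, y = [2.1, 4.0, 6.2, 8.1, 9.8, 12.2, 13.9, 40.0, 18.1, 20.2])

# Fit a robust linear model with a Tukey-loss M-estimator, GLM.jl-style
m = rlm(@formula(y ~ x), df, MEstimator{TukeyLoss}())
coef(m)   # robust intercept and slope, largely insensitive to the outlier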

You can find the documentation here: Home · RobustModels
The package is not registered yet, so install with:

] add https://github.com/getzze/RobustModels.jl

The package is now registered, install with:

] add RobustModels

Hi,

thank you very much for this package. I have one question up front, before I start my investigation of your package.
The standard deviation does not only reflect instability in terms of fluctuations, it also reflects the slope (represented by the linear regression).
To decouple the two, a) oscillation/fluctuation and b) drift/slope, I have in mind a new variant of the standard deviation: in the first step I calculate the slope by means of linear regression, in the second step I remove the slope from the investigated points (keeping the algebraic mean constant), and in the third step I calculate the slope-corrected standard deviation.
Is this a reasonable approach? How would this be calculated by means of your module?
Regards,
Stefan

If you don’t have (or don’t care about) outliers in your data, you don’t need a robust linear regression; GLM.jl (https://github.com/JuliaStats/GLM.jl), actually just lm, is enough.
Linear regression does exactly what you are describing: the intercept and slope are the two coefficients fitted by the model. What you call the “slope-corrected standard deviation” is basically the standard deviation of the residuals (real_data - prediction). If the model is valid, the standard deviation of the residuals should be significantly smaller than the standard deviation of the unfitted data.
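To make this concrete, here is a minimal sketch with made-up data (lm, @formula and residuals come from GLM.jl, std from the Statistics standard library):

using GLM, DataFrames, Statistics

# Toy series with a drift: slope of 0.5 per step plus random fluctuations
df = DataFrame(t = 1:20)
df.y = 0.5 .* df.t .+ randn(20)

ols = lm(@formula(y ~ t), df)   # ordinary least squares: intercept and slope

std(df.y)             # raw standard deviation, inflated by the drift
std(residuals(ols))   # “slope-corrected” standard deviation: fluctuations around the trend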


Hey, thank you for the post. LinRegOutliers.jl now implements the quantile regression estimator through a JuMP and GLPK interface in v0.8.16. I am now very curious about the performance differences between the two implementations. It seems the RobustModels.jl implementation is also based on a constrained-optimization setting.
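For reference, the τ-quantile regression problem can be cast as a linear program; a rough JuMP + GLPK sketch of that formulation (not the actual code of either package, and the data layout is assumed) looks like:

using JuMP, GLPK

# Minimize sum(τ*u + (1-τ)*v) subject to y - X*β = u - v, with u, v ≥ 0
function quantile_regression(X::Matrix, y::Vector, τ::Real)
    n, p = size(X)
    model = Model(GLPK.Optimizer)
    @variable(model, β[1:p])
    @variable(model, u[1:n] >= 0)   # positive part of each residual
    @variable(model, v[1:n] >= 0)   # negative part of each residual
    @constraint(model, [i = 1:n], y[i] - sum(X[i, j] * β[j] for j in 1:p) == u[i] - v[i])
    @objective(model, Min, sum(τ * u[i] + (1 - τ) * v[i] for i in 1:n))
    optimize!(model)
    return value.(β)
end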

Hey! Yes, I used Tulip.jl as it is pure Julia. I do not go through JuMP, which was tremendously increasing the package load time, but through Tulip’s direct interface.

Let me know if you do any kind of benchmarking!


Nice project!

  • Are you aware of the Python scikit-learn-extra package? I think it does similar things.

  • Myself, I have started RobustMeans.jl, which implements and defines some robust mean estimators (Huber, MoM, as well as some very recent ones). Maybe there is some possible crossover, allowing your robust regression to use a robust mean from RobustMeans.jl, i.e.
    \hat{\beta} = \arg\min_{\beta} \text{RobustMeanEstimator}\left[(\beta' X_i - Y_i)^2;\ 1 \leq i \leq n\right].
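A rough sketch of that objective, where robust_mean is only a placeholder (the median here, which gives least median of squares); RobustMeans.jl's actual API may differ:

using Optim, Statistics

# Placeholder robust mean; swap in an estimator from RobustMeans.jl
robust_mean(z) = median(z)

function robust_regression(X::Matrix, y::Vector; mean_estimator = robust_mean)
    # Minimize the robust mean of the squared residuals over β
    objective(β) = mean_estimator((X * β .- y) .^ 2)
    result = optimize(objective, zeros(size(X, 2)), NelderMead())
    return Optim.minimizer(result)
end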

Hi, thanks for the kind words and the link to your project, it looks very interesting. I had a look at the papers; regression using MoM and the other techniques needs more work than just using the functions that you implemented, but it would be great to add these techniques to RobustModels eventually. I will try to do that when I have some time, but feel free to make a PR to add these estimators to RobustModels.jl for the 1D case!