How to check if GLM will be able to run OLS beforehand?

drarnau · October 15, 2020, 2:18pm

Dear all,

I use the GLM package to compute a simple OLS regression with simulated data inside a loop. In some cases, the simulated data are such that OLS cannot be computed, and lm throws PosDefException: matrix is not positive definite; Cholesky factorization failed. That’s completely fine, except that it stops my loop. I would like to do something like this (pseudo-code):

if ols_is_feasible
  return lm(@formula(y ~ x), df)
else
  return Inf
end

Any idea how to easily do that?

Thank you.

nilshg · October 15, 2020, 2:32pm

I think you are looking for a try/catch statement; see the Control Flow chapter of the Julia manual.
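A minimal sketch of the try/catch approach, assuming the failure always surfaces as a PosDefException (as in the error quoted in the question). The helper name fit_or_inf is hypothetical, and a plain Cholesky call stands in for lm so that the snippet runs without GLM:

```julia
using LinearAlgebra

# Hypothetical helper: call `fit()` and return `Inf` if it throws a
# PosDefException; any other error is rethrown unchanged.
function fit_or_inf(fit)
    try
        return fit()
    catch err
        err isa PosDefException || rethrow()
        return Inf
    end
end

# Stand-in for a failing fit: this symmetric matrix is not positive definite,
# so `cholesky` throws a PosDefException.
not_pd = Symmetric([1.0 2.0; 2.0 1.0])
result = fit_or_inf(() -> cholesky(not_pd))
```

With GLM loaded, the call inside the loop would presumably look like fit_or_inf(() -> lm(@formula(y ~ x), df)). Catching only PosDefException keeps unrelated bugs (a misspelled column name, say) from being silently turned into Inf.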

pdeffebach · October 15, 2020, 2:50pm

The documentation for lm is not good, but you can estimate a model even with perfect multicollinearity:

julia> t = (y = rand(100), x1 = 1:100, x2 = 1:100);

julia> lm(@formula(y ~ x1 + x2), t, true); # a positional argument `true` here

drarnau · October 15, 2020, 2:57pm

That’s brilliant, thank you.

drarnau · October 15, 2020, 2:58pm

Interesting. Is one of the variables ignored here?

pdeffebach · October 15, 2020, 3:00pm

Yeah I guess x2 is ignored.
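To see why one column has to go, and to get something close to the ols_is_feasible check from the original question, one option is to test the rank of the design matrix directly. This is only a sketch: the helper name is hypothetical, the design matrix is built by hand rather than from a formula, and it assumes that exact collinearity is what triggers the exception.

```julia
using LinearAlgebra

# Hypothetical helper: OLS has a unique solution only if the design
# matrix has full column rank.
ols_is_feasible(X::AbstractMatrix) = rank(X) == size(X, 2)

x1 = collect(1.0:4.0)
X_collinear = [ones(4) x1 x1]       # intercept, x1, and an exact copy of x1
X_full_rank = [ones(4) x1 x1.^2]    # intercept, x1, and x1 squared

collinear_feasible = ols_is_feasible(X_collinear)
full_rank_feasible = ols_is_feasible(X_full_rank)
```

Note that rank uses a numerical tolerance, so a nearly collinear design can pass this check and still cause trouble in the Cholesky factorization. The try/catch route and the rank-deficient option of lm are more robust in that respect.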

dave.f.kleinschmidt · October 27, 2020, 4:45pm

The true tells GLM to use a pivoted Cholesky factorization:

https://github.com/JuliaStats/GLM.jl/blob/ef246bb8fdbfa3f3058435035d0b0cf42abdd06e/src/linpred.jl#L80-L117

and the interface:

https://github.com/JuliaStats/GLM.jl/blob/29a0e1c574a51d8574a79872b6c250364164a420/src/lm.jl#L125-L145

pdeffebach · October 27, 2020, 5:14pm

Additionally, I filed an issue a while ago to make this clearer by making allowrankdeficient a keyword argument here.

It would be a good first issue for someone who wants to get involved in linear estimation in Julia.

pdeffebach · November 1, 2020, 7:32pm

PR here.
