How to check if GLM will be able to run OLS beforehand?

Dear all,

I use the GLM package to compute a simple OLS regression on simulated data inside a loop. For some draws the regression cannot be computed, and GLM throws PosDefException: matrix is not positive definite; Cholesky factorization failed. That is completely fine in itself, but it stops my loop. I would like to do something like this (pseudo-code):

if ols_is_feasible
  return lm(@formula(y ~ x), df)
else
  return Inf
end

Any idea how to easily do that?

Thank you.

I think you are looking for a try/catch statement: Control Flow · The Julia Language
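
A minimal sketch of the try/catch approach. To keep it runnable with only the standard library, it uses a hand-rolled normal-equations solve as a stand-in for lm; the names fit_or_inf and normal_eq_ols are hypothetical, and the toy data is made up for illustration. With GLM loaded, one would presumably pass `() -> lm(@formula(y ~ x), df)` instead.

```julia
using LinearAlgebra

# Hypothetical helper: run `fit()` and return `Inf` if the Cholesky
# factorization fails; any other error is rethrown unchanged.
function fit_or_inf(fit)
    try
        return fit()
    catch err
        err isa PosDefException || rethrow()
        return Inf
    end
end

# Stand-in for OLS via the normal equations (an assumption for this sketch,
# not GLM's code). An unpivoted Cholesky throws PosDefException on singular X'X.
normal_eq_ols(X, y) = cholesky(Symmetric(X'X)) \ (X'y)

X_bad = [ones(5) zeros(5)]            # zero column, so X'X is singular
X_ok  = [ones(5) collect(1.0:5.0)]    # full column rank
y     = [1.0, 2.0, 3.0, 4.0, 5.0]

res_bad = fit_or_inf(() -> normal_eq_ols(X_bad, y))   # Inf
res_ok  = fit_or_inf(() -> normal_eq_ols(X_ok, y))    # coefficient vector
```

Checking the exception type before returning Inf keeps unrelated errors (a typo in a column name, say) from being silently swallowed inside the loop.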

The documentation for lm is not good on this point, but you can estimate a model even with perfect multicollinearity:

julia> t = (y = rand(100), x1 = 1:100, x2 = 1:100);

julia> lm(@formula(y ~ x1 + x2), t, true); # a positional argument `true` here

That’s brilliant, thank you.

Interesting. Is one of the variables ignored here?

Yeah I guess x2 is ignored.

The true tells GLM to use a pivoted Cholesky factorization:

https://github.com/JuliaStats/GLM.jl/blob/ef246bb8fdbfa3f3058435035d0b0cf42abdd06e/src/linpred.jl#L80-L117

and the interface:

https://github.com/JuliaStats/GLM.jl/blob/29a0e1c574a51d8574a79872b6c250364164a420/src/lm.jl#L125-L145
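
To illustrate what pivoting buys you, here is a small sketch using only the LinearAlgebra standard library. It is not GLM's internal code: the toy design matrix is made up, and RowMaximum assumes Julia 1.8 or later (older versions spell it Val(true)).

```julia
using LinearAlgebra

# Toy design: intercept, x, and an exact copy of x (perfect collinearity).
x = [-1.0, 1.0, -1.0, 1.0]
X = [ones(4) x x]
A = Symmetric(X'X)

# Unpivoted Cholesky fails on the singular matrix; with check = false it
# reports the failure through issuccess instead of throwing.
plain = cholesky(A; check = false)
issuccess(plain)      # false

# Pivoted Cholesky reorders the columns and stops once the remaining
# pivots are numerically zero, which exposes the rank.
F = cholesky(A, RowMaximum(); check = false)
F.rank                # 2: only two linearly independent columns
F.p                   # pivot order; the redundant column comes last
```

The columns listed after position F.rank in F.p are the ones a rank-revealing fit can drop, which matches the guess above that one of the duplicated regressors is ignored.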

Additionally, I filed an issue a while ago to make this clearer by turning allowrankdeficient into a keyword argument here.

It would be a good first issue for someone who wants to get involved in linear estimation in Julia.

PR here.