drarnau
October 15, 2020, 2:18pm
#1
Dear all,

I use the GLM package to compute a simple OLS regression with simulated data inside a loop. In some cases, the simulated data is such that OLS cannot be computed because `PosDefException: matrix is not positive definite; Cholesky factorization failed.`

. That’s completely fine. However, it stops my loop. I would like to (pseudo-code):

```
if ols_is_feasible
return lm(@formula(y ~ x), df)
else
return Inf
end
```

Any idea how to easily do that?

Thank you.

nilshg
October 15, 2020, 2:32pm
#2
1 Like

The documentation is not good for `lm`

. But you can estimate a model with perfect multi-colinearity

```
julia> t = (y = rand(100), x1 = 1:100, x2 = 1:100);
julia> lm(@formula(y ~ x1 + x2), t, true); # a positional argument `true` here
```

1 Like

drarnau
October 15, 2020, 2:57pm
#4
That’s brilliant, thank you.

drarnau
October 15, 2020, 2:58pm
#5
Interesting. Is one of the variables ignored here?

Yeah I guess `x2`

is ignored.

1 Like

The `true`

tells GLM to use a pivoted cholesky:

"""
DensePredChol{T}
A `LinPred` type with a dense Cholesky factorization of `X'X`
# Members
- `X`: model matrix of size `n` × `p` with `n ≥ p`. Should be full column rank.
- `beta0`: base coefficient vector of length `p`
- `delbeta`: increment to coefficient vector, also of length `p`
- `scratchbeta`: scratch vector of length `p`, used in [`linpred!`](@ref) method
- `chol`: a `Cholesky` object created from `X'X`, possibly using row weights.
- `scratchm1`: scratch Matrix{T} of the same size as `X`
- `scratchm2`: scratch Matrix{T} os the same size as `X'X`
"""
mutable struct DensePredChol{T<:BlasReal,C} <: DensePred
X::Matrix{T} # model matrix
beta0::Vector{T} # base vector for coefficients
delbeta::Vector{T} # coefficient increment
scratchbeta::Vector{T}

This file has been truncated. show original

and the interface:

function fit(::Type{LinearModel}, X::AbstractMatrix{<:Real}, y::AbstractVector{<:Real},
allowrankdeficient::Bool=false; wts::AbstractVector{<:Real}=similar(y, 0))
fit!(LinearModel(LmResp(y, wts), cholpred(X, allowrankdeficient)))
end
"""
lm(X, y, allowrankdeficient::Bool=false; wts=similar(y, 0))
An alias for `fit(LinearModel, X, y, allowrankdeficient)`
The arguments `X` and `y` can be a `Matrix` and a `Vector` or a `Formula` and a `DataFrame`.
The keyword argument `wts` can be a `Vector` specifying frequency weights for observations.
Such weights are equivalent to repeating each observation a number of times equal
to its weight. Do note that this interpretation gives equal point estimates but
different standard errors from analytical (a.k.a. inverse variance) weights and
from probability (a.k.a. sampling) weights which are the default in some other
software.
"""
lm(X, y, allowrankdeficient::Bool=false; kwargs...) =
fit(LinearModel, X, y, allowrankdeficient; kwargs...)

Additionally, I’ve filed an issue a while ago to make this more clear by making `allowrankdeficient`

a keyword argument here .

It would be a good first issue for someone who wants to get involved in linear estimation in Julia.

2 Likes