drarnau
October 15, 2020, 2:18pm
#1
Dear all,
I use the GLM package to compute a simple OLS regression with simulated data inside a loop. In some cases, the simulated data is such that OLS cannot be computed because PosDefException: matrix is not positive definite; Cholesky factorization failed.
. That’s completely fine. However, it stops my loop. I would like to (pseudo-code):
if ols_is_feasible
return lm(@formula(y ~ x), df)
else
return Inf
end
Any idea how to easily do that?
Thank you.
nilshg
October 15, 2020, 2:32pm
#2
1 Like
The documentation is not good for lm
. But you can estimate a model with perfect multi-colinearity
julia> t = (y = rand(100), x1 = 1:100, x2 = 1:100);
julia> lm(@formula(y ~ x1 + x2), t, true); # a positional argument `true` here
1 Like
drarnau
October 15, 2020, 2:57pm
#4
That’s brilliant, thank you.
drarnau
October 15, 2020, 2:58pm
#5
Interesting. Is one of the variables ignored here?
Yeah I guess x2
is ignored.
1 Like
The true
tells GLM to use a pivoted cholesky:
"""
DensePredChol{T}
A `LinPred` type with a dense Cholesky factorization of `X'X`
# Members
- `X`: model matrix of size `n` × `p` with `n ≥ p`. Should be full column rank.
- `beta0`: base coefficient vector of length `p`
- `delbeta`: increment to coefficient vector, also of length `p`
- `scratchbeta`: scratch vector of length `p`, used in [`linpred!`](@ref) method
- `chol`: a `Cholesky` object created from `X'X`, possibly using row weights.
- `scratchm1`: scratch Matrix{T} of the same size as `X`
- `scratchm2`: scratch Matrix{T} os the same size as `X'X`
"""
mutable struct DensePredChol{T<:BlasReal,C} <: DensePred
X::Matrix{T} # model matrix
beta0::Vector{T} # base vector for coefficients
delbeta::Vector{T} # coefficient increment
scratchbeta::Vector{T}
This file has been truncated. show original
and the interface:
function fit(::Type{LinearModel}, X::AbstractMatrix{<:Real}, y::AbstractVector{<:Real},
allowrankdeficient::Bool=false; wts::AbstractVector{<:Real}=similar(y, 0))
fit!(LinearModel(LmResp(y, wts), cholpred(X, allowrankdeficient)))
end
"""
lm(X, y, allowrankdeficient::Bool=false; wts=similar(y, 0))
An alias for `fit(LinearModel, X, y, allowrankdeficient)`
The arguments `X` and `y` can be a `Matrix` and a `Vector` or a `Formula` and a `DataFrame`.
The keyword argument `wts` can be a `Vector` specifying frequency weights for observations.
Such weights are equivalent to repeating each observation a number of times equal
to its weight. Do note that this interpretation gives equal point estimates but
different standard errors from analytical (a.k.a. inverse variance) weights and
from probability (a.k.a. sampling) weights which are the default in some other
software.
"""
lm(X, y, allowrankdeficient::Bool=false; kwargs...) =
fit(LinearModel, X, y, allowrankdeficient; kwargs...)
Additionally, I’ve filed an issue a while ago to make this more clear by making allowrankdeficient
a keyword argument here .
It would be a good first issue for someone who wants to get involved in linear estimation in Julia.
2 Likes