# Regression Variance Inflation Factor (VIF)

Hello, how do I check for VIF Variance inflation factor - Wikipedia in a GLM.jl `lm` model? I went through the JuliaStats org repositories and package documentations and could not find anything.

Hey. The following works w/ GLM:

``````using DataFrames, GLM

# Make data
N=10_000;
X0 = ones(N); X1=randn(N); X2=randn(N); X3=randn(N); Īµ = randn(N);
Ī²_0 = 4.0; Ī²_1 = 2.0; Ī²_2 =10.5; Ī²_3 =-9.5; Ļ_Īµ =1.0;
y = Ī²_0*X0 .+ Ī²_1*X1 .+ Ī²_2*X2 .+ Ī²_3*X3 .+ Ļ_Īµ*Īµ
X = [one.(X1) X1 X2 X3]
d = DataFrame(X1=X1, X2=X2, X3=X3, y=y)

# linear algebra
Ī²_LS = X \ y

# GLM
m_y  = lm(@formula(y ~ 1 + X1 + X2 + X3), d)
m_X1 = lm(@formula(X1 ~ 1 + X2 + X3), d)
m_X2 = lm(@formula(X2 ~ 1 + X1 + X3), d)
m_X3 = lm(@formula(X3 ~ 1 + X1 + X2), d)

vif_X1 = 1.0/(1.0-r2(m_X1))
vif_X2 = 1.0/(1.0-r2(m_X2))
vif_X3 = 1.0/(1.0-r2(m_X3))
``````

Ok, so it has to be done by handā¦

1. Correct. GLM.jl currently does not have a `vif` function.
2. RegTools.jl, has it, but is no longer maintainedā¦
3. Itās easy to do w/ linear algebra or to define your own function
``````using Statistics, LinearAlgebra
vifm = diag(inv(cor(X[:,2:end])))
vifm ā [vif_X1; vif_X2; vif_X3;]  # verify result above

# own function
vif_GLM(mod) = diag(inv(cor(mod.model.pp.X[:,2:end])))

julia> vif_GLM(m_y) ā vifm ā [vif_X1; vif_X2; vif_X3;]
true
``````
2 Likes

Try w/ ārealā data:

``````using DataFrames, GLM, Statistics, LinearAlgebra, RDatasets
airquality = rename(dataset("datasets", "airquality"), "Solar.R" => "Solar_R")
#
y =  airquality.Wind
X1 = airquality.Temp
X2 = airquality.Solar_R
X3 = airquality.Ozone
d = DataFrame(X1=X1, X2=X2, X3=X3, y=y)
d = d[completecases(d), :]
X = [one.(d.X1) d.X1 d.X2 d.X3]  .|> Float64
y = d.y .|> Float64
#
Ī²_LS = X \ y
vifm = diag(inv(cor(X[:,2:end])))
#
m_y  = lm(@formula(y ~ 1 + X1 + X2 + X3), d)
m_X1 = lm(@formula(X1 ~ 1 + X2 + X3), d)
m_X2 = lm(@formula(X2 ~ 1 + X1 + X3), d)
m_X3 = lm(@formula(X3 ~ 1 + X1 + X2), d)

vif_X1 = 1.0/(1.0-r2(m_X1))
vif_X2 = 1.0/(1.0-r2(m_X2))
vif_X3 = 1.0/(1.0-r2(m_X3))
#
vifm ā [vif_X1; vif_X2; vif_X3;]
#
vif_GLM(mod) = diag(inv(cor(mod.model.pp.X[:,2:end])))
vif_GLM(m_y) ā vifm ā [vif_X1; vif_X2; vif_X3;]
``````

Thank you!

1 Like