Linear regression without the intercept term

In GLM.jl, using a DataFrame is preferred, but the lm function also supports plain vectors and matrices. In the latter case, however, I can’t fit without the intercept term (i.e. b0 = 0). Is there a way to do this without using a DataFrame?

Looking at the source code, the X argument of the lm function has to be an AbstractMatrix, not an AbstractVector.
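For reference, the relevant definition in src/lm.jl looks roughly like this (paraphrased from the version I have installed; details may differ between releases):

# lm is a thin wrapper around fit, and the LinearModel fit method
# is only defined for an AbstractMatrix of predictors:
lm(X, y, allowrankdeficient::Bool=false) = fit(LinearModel, X, y, allowrankdeficient)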

Of course, when using a DataFrame there is no problem at all; this question is just out of curiosity :slight_smile:

julia> using GLM, DataFrames; x = [1,2,3]; y = [2,5,7]; data = DataFrame(X = x, Y = y);

julia> ols = lm(@formula(Y ~ 0 + X), data)
StatsModels.DataFrameRegressionModel{LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}

Formula: Y ~ +X

Coefficients:
     Estimate Std.Error t value Pr(>|t|)
X     2.35714 0.0874818 26.9444   0.0014


julia> ols = lm(x, y) # regression without the intercept term
ERROR: MethodError: no method matching fit(::Type{LinearModel}, ::Array{Int64,1}, ::Array{Int64,1}, ::Bool)
Closest candidates are:
  fit(::Type{StatsBase.Histogram}, ::Any...; kwargs...) at C:\Users\leejm516\.julia\packages\StatsBase\56Djy\src\hist.jl:319
  fit(::StatsBase.StatisticalModel, ::Any...) at C:\Users\leejm516\.julia\packages\StatsBase\56Djy\src\statmodels.jl:151
  fit(::Type{D<:Distributions.Distribution}, ::Any...) where D<:Distributions.Distribution at C:\Users\leejm516\.julia\packages\Distributions\WHjOk\src\genericfit.jl:34
  ...
Stacktrace:
 [1] lm(::Array{Int64,1}, ::Array{Int64,1}, ::Bool) at C:\Users\leejm516\.julia\packages\GLM\0c65q\src\lm.jl:146 (repeats 2 times)
 [2] top-level scope at none:0

julia> ols = lm([ones(3) x], y) # regression with the intercept term
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
      Estimate Std.Error   t value Pr(>|t|)
x1   -0.333333   0.62361 -0.534522   0.6875
x2         2.5  0.288675   8.66025   0.0732



You just have to make x a matrix.

julia> using GLM; x = [1,2,3]; y = [2,5,7];

julia> lm(reshape(x, length(x), 1), y)
LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
     Estimate Std.Error t value Pr(>|t|)
x1    2.35714 0.0874818 26.9444   0.0014
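Any of the standard ways of turning a vector into a one-column matrix work; for example (same data as above):

julia> lm(hcat(x), y)           # hcat of a single vector gives a 3×1 Matrix

julia> lm(reshape(x, :, 1), y)  # `:` lets reshape infer the first dimension

Both fit exactly the same no-intercept model as reshape(x, length(x), 1).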

We should probably add a convenience function for this, as it keeps coming up.
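Until then, a one-line convenience method is easy to define yourself. This is just a sketch, not part of GLM.jl, and the name vec_lm is made up:

# Hypothetical helper -- not provided by GLM.jl:
vec_lm(x::AbstractVector, y::AbstractVector) = lm(reshape(x, :, 1), y)

vec_lm(x, y)  # same fit as lm(reshape(x, length(x), 1), y)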


Yeah… what a silly question :slight_smile: Thank you for the kind reply.

The next step, obtaining the fitted values, was missing for me:

ops = lm(reshape(x, length(x), 1), y)
y_linreg_fitted = coef(ops)[1] .* x  # coef returns a one-element vector, so take the slope

or simply

fitted(ops)
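Both approaches agree; a quick check in the REPL (continuing from the definitions above):

julia> ops = lm(reshape(x, length(x), 1), y);

julia> fitted(ops) ≈ coef(ops)[1] .* x  # predict(ops) gives the same vector
true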

How do you do this for logistic regression?

Same thing?

julia> using GLM

julia> df = (x = rand(10), y = rand(Bool, 10));

julia> glm(@formula(y ~ 0 + x), df, Bernoulli(), LogitLink())
StatsModels.TableRegressionModel{GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Bernoulli{Float64}, LogitLink}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y ~ 0 + x

Coefficients:
──────────────────────────────────────────────────────────────
      Coef.  Std. Error      z  Pr(>|z|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────
x  -1.32166     1.47523  -0.90    0.3703   -4.21306    1.56975
──────────────────────────────────────────────────────────────

julia> glm(reshape(df.x, 10, 1), df.y, Bernoulli(), LogitLink())
GeneralizedLinearModel{GLM.GlmResp{Vector{Float64}, Bernoulli{Float64}, LogitLink}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}:

Coefficients:
───────────────────────────────────────────────────────────────
       Coef.  Std. Error      z  Pr(>|z|)  Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────
x1  -1.32166     1.47523  -0.90    0.3703   -4.21306    1.56975
───────────────────────────────────────────────────────────────
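One thing to keep in mind: for a GLM, predict (and fitted) return values on the response scale, i.e. the inverse link is already applied. A quick check, reusing the model above (the explicit logistic formula is just for illustration):

julia> model = glm(reshape(df.x, 10, 1), df.y, Bernoulli(), LogitLink());

julia> predict(model) ≈ 1 ./ (1 .+ exp.(-coef(model)[1] .* df.x))  # inverse logit of Xβ
true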