I’m trying to understand how generalized linear models (GLMs) work by coding a simple example from scratch, and I’m hoping some of the stats experts here (I know there are several of you lurking around) can help. In the case of simple linear regression, the code is super simple:
using Random
using StatsPlots
X = 1:10
Y = rand(-1:0.1:1, 10) .+ collect(1:2:20)  # points on a line plus uniform noise
β = hcat(ones(length(X)), X) \ Y           # estimate coefficients via least squares
Yᵢ(xᵢ) = β[1] + β[2] * xᵢ                  # function to estimate Yᵢ
scatter(X, Y, legend=false, xlabel="X", ylabel="Y")
plot!(X, Yᵢ.(X))
Now, I’m trying to understand how to code a simple GLM for something like this:
using DataFrames
using GLM
using Random
using StatsPlots
X = 1:10
Y2 = Y.^4 .+ Y
scatter(X, Y2, legend=false, xlabel="X", ylabel="Y")
With the GLM package, it’s obviously very easy to fit a model to this:
model = glm(@formula(Y2 ~ X), DataFrame(X=X, Y2=Y2), Normal(), LogLink())
But I’m trying to understand how to code something like I’ve done above, but for this example.
Wikipedia has a decent example of how to compute the expected value of Yᵢ: E(Yᵢ) = μᵢ = g⁻¹(Xᵢβ), where g is the link function.
I guess my specific questions are:
Is g simply the log() function in this case?
β = hcat(ones(length(X)), X) \ Y no longer holds true for this case, and I need to estimate β via MLE. Is this correct? Since Distributions.jl uses MLE for fitting distributions, is it possible to do what I need with that package, or do I need to use Optim.jl to solve this problem (or am I totally wrong about all of this)?
Hi Tamas - yes, GLM makes it very easy to fit models to these examples. What I’m trying to do is to figure out how to code it myself. For example, the estimates for the coefficients that are produced by
ols = lm(@formula(Y ~ X), DataFrame(X=X,Y=Y))
can be found very simply with β = hcat(ones(length(X)), X) \ Y, and then the simple function Yᵢ(xᵢ) = β[1] + β[2] * xᵢ will produce a point estimate.
I’m trying to figure out how to do the same, but for the exponential example. In GLM, it’s
model = glm(@formula(Y2 ~ X), DataFrame(X=X, Y2=Y2), Normal(), LogLink())
so I need to figure out how to estimate β, and then I believe I can plug that, along with log in place of g, into the function below to produce point estimates: Yᵢ(xᵢ) = g⁻¹(β₁ + β₂xᵢ) = exp(β₁ + β₂xᵢ).
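In case it helps to see the idea spelled out, here is a minimal sketch using Optim.jl (assuming the X and Y2 defined above; the helper names η, μ̂, sse, and Ŷ₂ and the starting values are my own choices, not from any package). Because the response is Normal with constant variance, maximizing the likelihood over β is equivalent to minimizing the sum of squared residuals on the response scale:
using Optim
η(β, x) = β[1] + β[2] * x                     # linear predictor
μ̂(β, x) = exp(η(β, x))                        # inverse link g⁻¹ = exp for LogLink
sse(β) = sum((Y2 .- μ̂.(Ref(β), X)) .^ 2)      # ∝ negative log-likelihood in β
res = optimize(sse, [0.0, 0.0], NelderMead()) # starting values are a guess
β̂ = Optim.minimizer(res)
Ŷ₂(xᵢ) = exp(β̂[1] + β̂[2] * xᵢ)                # point estimate on the response scale
NelderMead is gradient-free, so no derivatives are needed; a gradient-based method would converge faster but requires the gradient of sse.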
If you are interested in doing maximum likelihood estimation of your coefficient vector beta, then you could try Googling “generalized linear models Fisher’s scoring” and “generalized linear models iteratively weighted least squares”.
Unlike linear models, generalized linear models do not, in general, have closed-form solutions. You need to use iterative algorithms, such as:
Newton’s method (also known as the Newton-Raphson method)
Fisher’s scoring method
Iteratively reweighted least squares (IRLS)
IRLS is basically just a rewritten form of Fisher’s scoring, so the two are exactly equivalent.
Newton-Raphson is equivalent to Fisher’s scoring when you use the canonical link function; with a non-canonical link, the two are not necessarily equivalent.
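To make that concrete, here is a bare-bones Fisher scoring / IRLS loop for the Normal-with-LogLink example above (a minimal sketch: the function name, starting values, and stopping rule are my own choices, and it omits the conditioning safeguards a real implementation needs). For g = log we have g′(μ) = 1/μ, and the Normal variance function is V(μ) = 1, so the working response is z = η + (y − μ)/μ and the working weights are w = μ²:
using LinearAlgebra, Statistics
function irls(X, y; maxiter = 25, tol = 1e-8)
    Xmat = hcat(ones(length(X)), X)                  # design matrix with intercept
    β = [log(mean(y)), 0.0]                          # crude but safe starting values
    for _ in 1:maxiter
        η = Xmat * β                                 # linear predictor
        μ = exp.(η)                                  # inverse link
        z = η .+ (y .- μ) ./ μ                       # working response
        W = Diagonal(μ .^ 2)                         # working weights
        βnew = (Xmat' * W * Xmat) \ (Xmat' * W * z)  # weighted least squares step
        norm(βnew - β) < tol && return βnew
        β = βnew
    end
    return β
end
β̂ = irls(X, Y2)  # should closely match coef(model) from GLM.jl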
As pointed out by @dilumaluthge, there is no closed-form solution in general; various iterative methods exist that exploit the special structure of the problem. Practical implementations use some standard tricks to help with the usual numerical problems (mostly conditioning); the key algorithm is IRLS. GLM.jl uses this, and it is a nice codebase to follow along.
If you are interested in the numerical implementation of these algorithms, another nice package to read is