Help with GLM logistic regression

jonjilla · March 10, 2021, 4:21pm

Can someone help me with using GLM for logistic regression?

I’m testing this out on the iris dataset.

using RDatasets
iris = dataset("datasets","iris")

I’ve worked with this before and know that PetalLength is a perfect predictor/classifier for the Setosa class.

So I created a one-hot encoded column called IsSetosa and then used logistic regression to model IsSetosa on the input factors:

iris[:,:IsSetosa] = iris.Species.=="setosa"
logit = glm( @formula(IsSetosa ~ SepalLength+SepalWidth+PetalLength+PetalWidth), iris, Binomial(), ProbitLink())

Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)  -3.88972     7989.87  -0.00    0.9996  -15663.7    15656.0
SepalLength   2.88224     2279.99   0.00    0.9990   -4465.82    4471.59
SepalWidth    1.81596     1038.46   0.00    0.9986   -2033.52    2037.15
PetalLength  -4.91615     1903.18  -0.00    0.9979   -3735.08    3725.25
PetalWidth   -5.33176     2689.12  -0.00    0.9984   -5275.92    5265.25
────────────────────────────────────────────────────────────────────────

Two questions:

1. How can I access each value in this output for a given independent variable? For example, how can I get the p-value for PetalLength?
2. JMP also reports a “Whole Model Test”, which is the overall test of all the inputs on the output. Is there an equivalent in the Julia GLM package (or in other packages)?
JMP Help

Thanks in advance!

pdeffebach · March 10, 2021, 5:17pm

With regard to your first point, this is a bit of a limitation at the moment.

Here is a workaround that puts everything in a more easily accessible structure:

julia> using StatsBase, NamedArrays

julia> function make_named_array(c::CoefTable)
           n_mat = reduce(hcat, c.cols) |> NamedArray
           setnames!(n_mat, c.rownms, 1)
           setnames!(n_mat, c.colnms, 2)
           return n_mat
       end;

julia> make_named_array(c)
2×6 Named Array{Float64,2}
      A ╲ B │      Coef.  Std. Error           z    Pr(>|z|)   Lower 95%   Upper 95%
────────────┼───────────────────────────────────────────────────────────────────────
(Intercept) │   0.223037    0.127292     1.75216   0.0797464  -0.0264519    0.472525
x           │  -0.189126    0.133999     -1.4114    0.158127   -0.451759   0.0735072

jonjilla · March 10, 2021, 5:35pm

pdeffebach:

function make_named_array(c::CoefTable)
    n_mat = reduce(hcat, c.cols) |> NamedArray
    setnames!(n_mat, c.rownms, 1)
    setnames!(n_mat, c.colnms, 2)
    return n_mat
end;

Thanks for the help. I tried this function, passing it what I thought was the coefficient table from the logistic regression output, but I get this error:

julia> make_named_array(coef(logit))
ERROR: MethodError: no method matching make_named_array(::Vector{Float64})

pdeffebach · March 10, 2021, 6:51pm

Ah, coef returns a vector of the coefficients. You want make_named_array(coeftable(logit)).

Also make sure you have

using NamedArrays

at the top as well.
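
Putting the pieces together, here is a minimal sketch of pulling a single p-value out of a fitted model, either straight from the CoefTable fields or through the named array. The data below are made up purely for illustration, and the names df, model, row, p_x, and tbl are hypothetical. The cols, rownms, colnms, and pvalcol fields are internals of CoefTable rather than a documented interface, so treat their use as an assumption that may not hold in every StatsBase version. Note that the sketch uses LogitLink() for a logistic fit, whereas the call in the first post uses ProbitLink(), which fits a probit model.

```julia
using DataFrames, GLM, NamedArrays

# Made-up data with overlapping classes, used only for illustration.
df = DataFrame(x = collect(1.0:20.0),
               y = [0,0,0,0,1,0,0,1,0,1,0,1,1,0,1,1,1,1,1,1])

model = glm(@formula(y ~ x), df, Binomial(), LogitLink())
ct = coeftable(model)

# Option 1: index the CoefTable fields directly.
# pvalcol is the index of the p-value column; rownms holds the term names.
row = findfirst(==("x"), ct.rownms)
p_x = ct.cols[ct.pvalcol][row]

# Option 2: the named-array workaround from above (untyped argument here,
# so that StatsBase does not need to be loaded for the CoefTable type).
function make_named_array(c)
    n_mat = NamedArray(reduce(hcat, c.cols))
    setnames!(n_mat, c.rownms, 1)
    setnames!(n_mat, c.colnms, 2)
    return n_mat
end

tbl = make_named_array(ct)
# Look the column up by the name stored in the table itself, since the
# p-value column label can differ between package versions.
p_x_named = tbl["x", ct.colnms[ct.pvalcol]]

println("p-value for x: ", p_x)
```

For the model in this thread, the same lookup would use coeftable(logit) and the row name "PetalLength".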

jonjilla · March 10, 2021, 7:01pm

Thanks, that works.
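
On the second question (an equivalent of JMP’s Whole Model Test), one possible approach is a likelihood-ratio test of the fitted model against an intercept-only model. The following is a minimal sketch under stated assumptions, not a built-in GLM.jl feature: the data are made up, the names full_model, null_model, lr_stat, df_diff, and p_value are hypothetical, and the statistic is computed by hand from the two deviances. Non-separated data are used on purpose; in the Setosa example the classes are perfectly separated, which is why the standard errors in the first post are so large.

```julia
using DataFrames, GLM, Distributions

# Made-up data with overlapping classes, used only for illustration.
df = DataFrame(x = collect(1.0:20.0),
               y = [0,0,0,0,1,0,0,1,0,1,0,1,1,0,1,1,1,1,1,1])

# Full model and intercept-only (null) model, both with a logit link.
full_model = glm(@formula(y ~ x), df, Binomial(), LogitLink())
null_model = glm(@formula(y ~ 1), df, Binomial(), LogitLink())

# Likelihood-ratio statistic: the drop in deviance from null to full model.
lr_stat = deviance(null_model) - deviance(full_model)

# Degrees of freedom: number of extra coefficients in the full model.
df_diff = length(coef(full_model)) - length(coef(null_model))

# Upper-tail probability of a chi-squared distribution with df_diff degrees of freedom.
p_value = ccdf(Chisq(df_diff), lr_stat)

println("LR statistic = ", lr_stat, ", df = ", df_diff, ", p = ", p_value)
```

A small p_value suggests that the predictors, taken together, improve on the intercept-only model.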
