Can someone help me use GLM for logistic regression?
I’m testing this out on the iris dataset.
using GLM, RDatasets
iris = dataset("datasets","iris")
I’ve worked with this dataset before and know that PetalLength is a perfect predictor/classifier for the Setosa class.
So I created a one-hot encoded column called IsSetosa and then fit a model of IsSetosa on the four input measurements:
iris[:,:IsSetosa] = iris.Species.=="setosa"
logit = glm( @formula(IsSetosa ~ SepalLength+SepalWidth+PetalLength+PetalWidth), iris, Binomial(), ProbitLink())
Coefficients:
────────────────────────────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)  Lower 95%  Upper 95%
────────────────────────────────────────────────────────────────────────
(Intercept)  -3.88972     7989.87  -0.00    0.9996   -15663.7    15656.0
SepalLength   2.88224     2279.99   0.00    0.9990   -4465.82    4471.59
SepalWidth    1.81596     1038.46   0.00    0.9986   -2033.52    2037.15
PetalLength  -4.91615     1903.18  -0.00    0.9979   -3735.08    3725.25
PetalWidth   -5.33176     2689.12  -0.00    0.9984   -5275.92    5265.25
────────────────────────────────────────────────────────────────────────
Two questions:
- How can I access the individual values in this output for each independent variable? For example, how can I get the p-value for PetalLength?
- JMP also reports a “Whole Model Test”, which tests the overall significance of the model against the response. Is there an equivalent in the Julia GLM package (or in other packages)?
JMP Help
Thanks in advance!
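One possible approach to both questions, as a minimal sketch rather than a definitive answer. It assumes that `coefnames`, `coef`, `stderror`, `deviance`, and `dof` are available on the fitted model (they are part of the standard statistics API that GLM exports), and that Distributions is installed for the normal and chi-squared tail probabilities. The variable names (`terms`, `null_model`, `lr_stat`, and so on) are made up for illustration. The "whole model" part is a hand-rolled likelihood ratio test against an intercept-only model, which is similar in spirit to JMP's Whole Model Test but not guaranteed to match its output. Two caveats: `ProbitLink()` fits a probit model, and `LogitLink()` would be the choice for logistic regression proper; and because the classes are perfectly separated, the huge standard errors are expected and the fit may warn or fail to converge depending on the GLM version.

```julia
using GLM, RDatasets, Distributions

iris = dataset("datasets", "iris")
iris.IsSetosa = iris.Species .== "setosa"

f = @formula(IsSetosa ~ SepalLength + SepalWidth + PetalLength + PetalWidth)
model = glm(f, iris, Binomial(), ProbitLink())

# Per-coefficient values, all in the same order as coefnames(model).
terms = coefnames(model)
est   = coef(model)
se    = stderror(model)
z     = est ./ se
p     = 2 .* ccdf.(Normal(), abs.(z))   # two-sided Wald p-values

# Look up a single term by name.
i = findfirst(==("PetalLength"), terms)
p_petal_length = p[i]

# Whole-model check: likelihood ratio test against an intercept-only model.
null_model = glm(@formula(IsSetosa ~ 1), iris, Binomial(), ProbitLink())
lr_stat    = deviance(null_model) - deviance(model)  # difference in deviance
lr_df      = dof(model) - dof(null_model)            # number of extra coefficients
lr_pvalue  = ccdf(Chisq(lr_df), lr_stat)
```

`coeftable(model)` returns the same table that is printed above, in case you would rather read the columns from there than recompute them.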