Logistic Regression Problem


#1

I’m trying to use a logistic regression algorithm to find a classification model, but I get stuck with the error “failure to converge after 30 iterations”. I have changed the maxIter argument to a higher value, but the error only disappears at a very high number of iterations: in the small example below, with a population of 1000 elements and a very simple implicit model, I need 2000 iterations! Am I doing something wrong?

```julia
using DataFrames, GLM

df = DataFrame()
n = 1000
df.x = rand(n)
df.y = rand(n)
df.z = rand(n)
df.valid = map((x, y) -> (x^2 - y^6) > 0 ? true : false, df.x, df.y)
glm(@formula(valid ~ x + y + z), df, Binomial(), LogitLink())
```


#2

Since there is no noise added to the linear predictor, the model predicts the outcomes perfectly and, as a consequence, the MLE doesn’t exist: the likelihood function has no optimum but keeps growing as one or more of the model parameters diverge. This is known as complete (or quasi-complete) separation; see e.g. https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqwhat-is-complete-or-quasi-complete-separation-in-logisticprobit-regression-and-how-do-we-deal-with-them/. In ML, people usually add some regularization to ensure that an optimum exists, but GLM doesn’t add any.
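To see this, note that any noise in the outcomes is enough to make the fit converge. Here is a minimal sketch (the 5% flip rate is arbitrary) that flips a few labels in your example, after which glm() converges with the default settings:

```julia
using DataFrames, GLM, Random

Random.seed!(1)
n = 1000
df = DataFrame(x = rand(n), y = rand(n), z = rand(n))
df.valid = (df.x .^ 2 .- df.y .^ 6) .> 0   # perfectly separated labels
flip = rand(n) .< 0.05                     # pick ~5% of rows at random
df.valid = xor.(df.valid, flip)            # flip those labels: no more separation
glm(@formula(valid ~ x + y + z), df, Binomial(), LogitLink())
```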


#3

Thanks, that was a perfect answer!


#4

Is it planned functionality to add some simple regularization to GLM? If, as I’m assuming, fitting a GLM is essentially a Newton-Raphson algorithm, it should be easy to add an optional L2 cost parameter. It could be a big plus for usability: otherwise it can be confusing for a new user that, as soon as your problem is a bit ill-defined or unstable, you need to switch to (I guess) GLMnet or Lasso, which is a different package with different syntax, etc.
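For the record, here is a rough sketch of what that could look like; this is not GLM.jl code (ridge_logistic and lambda are made-up names), just IRLS solving the penalized normal equations (X'WX + λI)β = X'Wz instead of the usual X'WXβ = X'Wz:

```julia
using LinearAlgebra

# Hypothetical sketch, not GLM.jl API: logistic regression via IRLS with an
# optional L2 (ridge) penalty. The lambda*I term keeps X'WX invertible even
# under complete separation, so the iteration converges.
function ridge_logistic(X, y; lambda = 1e-2, maxiter = 100, tol = 1e-8)
    n, p = size(X)
    beta = zeros(p)
    for _ in 1:maxiter
        eta = X * beta
        mu = @. 1 / (1 + exp(-eta))        # logistic mean
        w = @. max(mu * (1 - mu), 1e-10)   # IRLS weights, floored for stability
        z = @. eta + (y - mu) / w          # working response
        beta_new = (X' * (w .* X) + lambda * I) \ (X' * (w .* z))
        norm(beta_new - beta) < tol && return beta_new
        beta = beta_new
    end
    return beta
end

# Hypothetical usage on the separated data from the first post:
# X = [ones(n) df.x df.y df.z]
# ridge_logistic(X, Float64.(df.valid))
```

With the penalty in place the update stays well conditioned even on perfectly separated data, which is exactly the case where the plain Newton-Raphson iteration diverges.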