The following code throws an error:
using DataFrames, GLM
df = DataFrame(x1=[1,2,3,4], x2=[1,2,3,4], y=[1,1,0,0])
mdl = glm(@formula(y~x1+x2), df, Binomial(), LogitLink())
predict(mdl, df[:, [:x1, :x2]])
ERROR: LoadError: PosDefException: matrix is not positive definite; Cholesky factorization failed.
On the other hand, logistic regression on the same data succeeds with ScikitLearn:
using DataFrames, ScikitLearn
@sk_import linear_model: LogisticRegression
df = DataFrame(x1=[1,2,3,4], x2=[1,2,3,4], y=[1,1,0,0])
mdl = fit!(LogisticRegression(), Matrix(df[:, [:x1, :x2]]), df.y)
ps = predict_proba(mdl, Matrix(df[:, [:x1, :x2]]))
Multicollinearity cannot be an issue for logistic regression, unlike linear regression.
So, any idea why GLM fails in this case?
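For reference, here is a minimal check of the design matrix, written as a sketch that assumes the intercept column `@formula` adds by default. The matrix `X` is built by hand for illustration and is not taken from GLM's internals. Since x1 and x2 are identical, the matrix does not have full column rank, which may be related to the Cholesky error above:

```julia
using LinearAlgebra

# Hand-built design matrix: intercept, x1, x2 (assumed layout, for illustration only).
X = [ones(4) [1, 2, 3, 4] [1, 2, 3, 4]]

# x1 and x2 are identical, so only 2 of the 3 columns are linearly independent.
@show rank(X)

# Check whether the Gram matrix X'X is positive definite,
# which a Cholesky factorization requires.
@show isposdef(X' * X)
```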