How do I do binary classification with GLMnet or Lasso?


#1

I wanted to include GLMnet and Lasso in
http://white.ucc.asn.au/2017/12/18/7-Binary-Classifier-Libraries-in-Julia.html

But I outright could not work out how to use them.

I feel like they can be used for this,
but maybe I just don’t understand what they are for at all.


#2

You mean theoretically? These “shrinkage” methods have two common uses:

  1. They shrink parameter estimates (often all the way to 0) in a model with many parameters. E.g. you have 100 candidate parameters, and when you fit the model many of them look significant but you want a parsimonious model, so you use something like elastic net.
  2. When you don’t have much data, they stabilise the parameter estimates by making them more moderate; some studies show that some of these methods are equivalent to adding data points to your dataset.

This is a pretty rough overview, but the only times I recall applying these methods were in situation 1, where I had lots of candidate explanatory variables and wanted a reasonable check on which parameters to estimate.
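
To make "shrink (often to 0)" a bit more concrete, here is the penalized objective as I understand it from the glmnet documentation. The symbols are my notation, not something from the posts above: $\ell$ is the per-observation loss (the negative log-likelihood, e.g. logistic loss for a binary outcome), $\lambda \ge 0$ is the overall penalty strength, and $\alpha \in [0, 1]$ mixes the two penalties. Observation weights are left out for brevity.

$$
\min_{\beta_0,\,\beta}\; \frac{1}{N}\sum_{i=1}^{N} \ell\bigl(y_i,\ \beta_0 + x_i^\top \beta\bigr) \;+\; \lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\Bigr]
$$

With $\alpha = 1$ this is the lasso and with $\alpha = 0$ it is ridge regression. The $\lVert\beta\rVert_1$ term is the part that can set coefficients exactly to 0; the squared term only pulls them toward 0.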

Also, using these methods can be tricky, as you often have to tune some hyperparameters as well. These hyperparameters can serve many purposes, but an important one controls how strongly large parameter estimates are penalised. To find the “optimal” hyperparameter you would usually use cross-validation, which is computationally intensive, so I didn’t find these methods that useful for really large datasets. There are also other ways to select variables in large-data settings, so you don’t have to use shrinkage methods; but they are still useful to understand and to benchmark your results against.


#3

I have used GLMnet with LASSO quite a bit for classification in R. It’s just like other GLM-like linear models: you give it input variables and a boolean output variable, and it returns a model. Actually it gives you a range of models that vary by lambda, its hyperparameter - different lambdas give you different coefficient mixes. So you can then use CV or similar methods to pick the best-performing lambda. GLMnet also works well for Cox and regression models. Elastic net is related: it blends the lasso (L1) penalty with a ridge (L2) penalty, which often behaves better when predictors are correlated.
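
In case it helps to see the "range of models by lambda" idea spelled out, here is a rough, library-free Python sketch. It is not glmnet's algorithm (glmnet uses coordinate descent; this uses plain proximal gradient descent), and the synthetic data, the lambda grid, the step size, and every function name are assumptions made up for illustration. It also uses a single hold-out split instead of full cross-validation to keep it short.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward 0, and to exactly 0 if |v| <= t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=1.0, iters=300):
    """Minimise mean log-loss + lam * ||beta||_1 by proximal gradient descent.

    The intercept b0 is left unpenalised.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(b0 + sum(b * x for b, x in zip(beta, xi))) - yi
            g0 += r / n
            for j in range(p):
                g[j] += r * xi[j] / n
        b0 -= step * g0
        beta = [soft_threshold(b - step * gj, step * lam) for b, gj in zip(beta, g)]
    return b0, beta


def accuracy(model, X, y):
    # Classify by thresholding the fitted probability at 0.5.
    b0, beta = model
    hits = 0
    for xi, yi in zip(X, y):
        prob = sigmoid(b0 + sum(b * x for b, x in zip(beta, xi)))
        hits += int((prob >= 0.5) == (yi == 1))
    return hits / len(y)


# Synthetic data (made up for illustration): features 0 and 1 matter, 2-4 are noise.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(200)]
y = [1 if 3 * x[0] - 3 * x[1] + rng.gauss(0, 0.3) > 0 else 0 for x in X]
X_train, y_train = X[:120], y[:120]
X_val, y_val = X[120:], y[120:]

# One model per lambda, then keep the lambda with the best hold-out accuracy.
lambdas = [2.0, 0.3, 0.1, 0.03, 0.001]
path = {lam: fit_l1_logistic(X_train, y_train, lam) for lam in lambdas}
scores = {lam: accuracy(path[lam], X_val, y_val) for lam in lambdas}
best_lam = max(lambdas, key=lambda lam: scores[lam])

for lam in lambdas:
    nonzero = sum(1 for b in path[lam][1] if b != 0.0)
    print(f"lambda={lam:<6} nonzero={nonzero} val_acc={scores[lam]:.2f}")
print("best lambda:", best_lam)
```

The largest lambda in this grid is big enough that every coefficient stays at exactly 0, which is the "different lambdas give you different coefficient mixes" effect at its extreme. In real use you would let the library build the lambda grid and run proper k-fold CV.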

But this may not be what you were asking. I can’t tell. What did you try and what didn’t work?


#4

If you’re looking for documentation on expressing classification as a GLMnet problem,
have you looked at the “Logistic” section in the test suite of GLMnet.jl, and maybe the examples for the original R package (PDF)?

If you want a suggestion for making your example suitable for this sort of thing, consider regressing on coefficients for an expansion in orthogonal polynomials of the coordinates.
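
A minimal sketch of what such an expansion could look like, in plain Python. The choice of Legendre polynomials, the degree, and the function names are my assumptions (the suggestion above does not name a polynomial family), and it assumes each coordinate has already been rescaled to [-1, 1], where Legendre polynomials are orthogonal.

```python
def legendre_features(x, degree):
    """Return [P_1(x), ..., P_degree(x)] using the three-term recurrence."""
    values = [1.0, x]  # P_0 and P_1
    for n in range(1, degree):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[1:degree + 1]


def expand(point, degree=3):
    # Concatenate the per-coordinate expansions. P_0 is dropped because the
    # model's intercept already covers the constant term.
    features = []
    for coordinate in point:
        features.extend(legendre_features(coordinate, degree))
    return features


# A 2-D point becomes a 6-column feature row for degree 3.
row = expand([0.5, -1.0], degree=3)
print(row)
```

Each original point then contributes one row of polynomial features, and the penalized regression picks which terms to keep. This sketch leaves out cross terms between coordinates, which a curved decision boundary may well need.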


#5

GLMnet and Lasso are both regularized (or Bayesian) estimators. They are not classification models in themselves, though; they can be used to fit a linear probability model or a probit/logit model, which gives you probability estimates. You can then use those estimates in a second step as a linear binary classifier, e.g. by thresholding them. An alternative is to train a linear classifier directly.
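
A small sketch of that two-step idea in Python, assuming a logit model has already been fitted. The coefficients here are hypothetical numbers picked for illustration, not output from GLMnet or Lasso, and the function names are mine.

```python
import math


def predict_proba(intercept, coefs, x):
    # Step 1: the fitted logit model turns a linear score into a probability.
    z = intercept + sum(c * v for c, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def classify(prob, threshold=0.5):
    # Step 2: a hard decision rule on top of the probability estimate.
    return 1 if prob >= threshold else 0


# Hypothetical fitted coefficients; the zero mimics a variable the penalty dropped.
intercept, coefs = -0.5, [2.0, 0.0, -1.0]
points = [[1.0, 5.0, 0.5], [0.0, -3.0, 1.0], [0.25, 0.0, 0.0]]

probs = [predict_proba(intercept, coefs, x) for x in points]
labels = [classify(p) for p in probs]
strict_labels = [classify(p, threshold=0.6) for p in probs]
print(probs, labels, strict_labels)
```

With a threshold of 0.5 this is the same as checking the sign of the linear score, which is why the result is a linear classifier. Moving the threshold trades false positives against false negatives without refitting anything.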