How to implement One-Vs-Rest for Multi-Class Classification?

Greetings, I’m new to Julia and I am trying to implement one-vs-rest multi-class classification, and I was wondering if anyone could help me out. Here is a snippet of my code so far:
My data frame is basic since I’m trying to figure out the implementation first: the c column is my class, taking values in [0, 1, 2], and y, x1, x2, x3 are random Int64 values.
using DataFrames
using CSV
using StatsBase
using StatsModels
using Statistics
using Plots, StatsPlots
using GLM
using Lathe

df = DataFrame(CSV.File("data.csv"))

fm = @formula(c~x1+x2+x3+y)

model0 = glm(fm0, df, Binomial(), ProbitLink()) # 0 vs [1,2]
model1 = glm(fm1, df, Binomial(), ProbitLink()) # 1 vs [0,2]
model2 = glm(fm2, df, Binomial(), ProbitLink()) # 2 vs [0,1]

I am trying to make logistic models but I don’t know how to do it :frowning:
If anyone can help me out, I would be thrilled.

What exactly are you struggling with? In your code snippet you are using the variables fm0, fm1, and fm2, which are not defined. It seems to me that defining these in line with your goal (by generating three target variables in your data and putting the correct left hand side variable into each formula) should do the trick?
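For example, something along these lines (just an untested sketch, using the column names from your post):

df.c0 = Int.(df.c .== 0)   # 1 where the class is 0, else 0
df.c1 = Int.(df.c .== 1)   # 1 where the class is 1, else 0
df.c2 = Int.(df.c .== 2)   # 1 where the class is 2, else 0

fm0 = @formula(c0 ~ x1 + x2 + x3)
fm1 = @formula(c1 ~ x1 + x2 + x3)
fm2 = @formula(c2 ~ x1 + x2 + x3)

With those defined, your three glm calls should run as written. To get a single predicted class per observation you could then take the class whose model returns the highest predicted probability (e.g. an argmax over the three predict outputs).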

On an unrelated note, could you explain why you are using Lathe? This package pops up occasionally on this forum, usually when it has created problems for people by holding back dependencies (as it hasn’t been updated in over a year and a half).

I couldn’t figure out the formulas at first, but I did figure it out in the end:
fm0 = @formula((c == 0) ~ x1 + x2 + x3)
fm1 = @formula((c == 1) ~ x1 + x2 + x3)
fm2 = @formula((c == 2) ~ x1 + x2 + x3)

I am using Lathe to split the data into train and test sets:
dfTrain, dfTest = Lathe.preprocess.TrainTestSplit(df, .8)

Is there any other way I can split the train and test sets without using Lathe?

Yeah you definitely don’t need a massive outdated dependency like Lathe for that. If you do want to rely on a package for something this simple you can look at MLUtils.jl (https://github.com/JuliaML/MLUtils.jl), but you could also just use StatsBase to sample row indices for the train set (and then invert them to get the test set).
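For example, a rough sketch of an 80/20 split with StatsBase (untested, assuming the df from your earlier post):

using StatsBase, DataFrames

n = nrow(df)
train_idx = sample(1:n, round(Int, 0.8 * n); replace = false, ordered = true)  # random 80% of the row indices
test_idx  = setdiff(1:n, train_idx)                                            # the remaining 20%

dfTrain = df[train_idx, :]
dfTest  = df[test_idx, :]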
