LIBSVM.jl with imbalanced dataset

juliohm · February 14, 2023, 4:36pm

I am trying to use LIBSVM.jl with an imbalanced dataset.

Providing weights for each class doesn’t seem to affect the result:

using CSV
using DataFrames
using LIBSVM

df = CSV.read("svm.csv", DataFrame)

X = [df.x1 df.x2 df.x3]'
y = df.y
w = Dict(l => 1 / count(==(l), y) for l in unique(y))
svm = svmtrain(X, y, weights=w)

x1 = range(-0.5,0.5, length=100)
x2 = range(-0.5,0.5, length=100)
x3 = range(-0.5,0.5, length=100)
xs = [collect(x) for x in Iterators.product(x1,x2,x3)]
X̂ = reduce(hcat, vec(xs))
ŷ, _ = svmpredict(svm, X̂)

Can you reproduce the issue? I’ve uploaded the dataset in this gist:

https://gist.github.com/juliohm/a0c98c0d386d297e2105818652faa076

svm.csv

x1,x2,x3,y
-0.16485239157797732,-0.2218786857477146,0.025657251025896506,1
-0.14138283354447023,-0.18115496792737906,-0.08881228951195215,2
-0.19156671553489246,-0.21063046433580623,-0.2917535547111554,3
-0.279588288598788,-0.27721378406398256,0.015351159741035935,4
-0.3226797091143479,-0.3261822936012488,0.10225411867529889,2
0.36719732717363746,0.4162419801073728,0.15575307734063756,2
-0.2758953203829891,-0.2474704632695943,-0.14546736263944215,2
-0.271400774910694,-0.26209306422374024,-0.005437600507968055,4
-0.4938512097624689,-0.3518243528882096,-0.07185283149402386,3

This file has been truncated.
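
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: in LIBSVM a class weight multiplies the cost parameter C for that class, so weights of the form 1 / count make the effective cost very small for every class, not just the frequent ones. The snippet below compares those weights with a rescaled "balanced" variant that keeps the same ratios between classes but averages to 1 per sample. The helper name balanced_weights and the toy label vector are assumptions made up for illustration; they are not part of LIBSVM.jl or the dataset.

```julia
# Hypothetical helper (not part of LIBSVM.jl): "balanced" class weights,
# w[l] = n / (k * count(l)), so that the average weight per sample is 1.
function balanced_weights(y)
    labels = unique(y)
    n, k = length(y), length(labels)
    return Dict(l => n / (k * count(==(l), y)) for l in labels)
end

# Toy labels standing in for df.y (made up for illustration).
y = [1, 1, 1, 1, 1, 1, 2, 2, 3]

# Weights as computed in the post: every value is at most 1.
w_raw = Dict(l => 1 / count(==(l), y) for l in unique(y))

# Rescaled weights: same ratios between classes, larger overall scale.
w_bal = balanced_weights(y)

println(w_raw)
println(w_bal)

# Assumed usage, same call as in the post:
# svm = svmtrain(X, y, weights=w_bal)
```

If the predictions change with the rescaled weights, the overall scale of the weights (or equivalently the cost keyword of svmtrain) is probably the relevant knob rather than the weighting itself.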
