What’s the best ML package to use for multi-label classification in Julia? Hopefully it will have multiple algorithms; otherwise, do suggest particular packages too. MLJ? Flux? Or TensorFlow, etc.?
It is now possible to implement multi-label classification in BetaML, my own ML library, in an easy way using the neural network module:
using BetaML # min v0.5.5 due to the new weightless ScalarFunctionLayer
# Creating test data..
X = rand(2000,2)
# note Y is a (2000 x 3) matrix of 0.0/1.0 floats...
Y = hcat(round.(tanh.(0.5 .* X[:,1] + 0.8 .* X[:,2])),
         round.(tanh.(0.5 .* X[:,1] + 0.3 .* X[:,2])),
         round.(tanh.(max.(0.0, -3 .* X[:,1].^2 + 2 .* X[:,1] + 0.5 .* X[:,2]))))
# Creating the NN model...
l1 = DenseLayer(2,10,f=relu)
l2 = DenseLayer(10,3,f=relu)
# A weightless layer is needed here; the relu of the previous layer guarantees that the input to tanh is non-negative:
l3 = ScalarFunctionLayer(3,f=tanh)
mynn = buildNetwork([l1,l2,l3],squaredCost,name="Multinomial multilabel regression Model")
# Training the model...
train!(mynn,X,Y,epochs=100,batchSize=8)
# Predictions...
ŷ = round.(predict(mynn,X))
(nrec,ncat) = size(Y)
# Just a basic accuracy measure. I could consider extending the ConfusionMatrix measures to multi-label classification if needed...
overallAccuracy = sum(ŷ .== Y)/(nrec*ncat) # 0.988
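As a possible complement (not part of the snippet above, just plain Julia reusing ŷ, Y, nrec and ncat), one could also compute a per-label accuracy and the stricter "exact match" accuracy often used in multi-label settings:

```julia
# Per-label accuracy: fraction of records where each individual label is predicted correctly
perLabelAccuracy = [sum(ŷ[:,c] .== Y[:,c]) / nrec for c in 1:ncat]

# "Exact match" accuracy: fraction of records where all labels are correct at the same time
exactMatchAccuracy = sum(all(ŷ .== Y, dims=2)) / nrec
```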
I initially thought of using softmax with a learnable beta parameter, but then I realised that this is not possible: how would the model be able to distinguish between Y = [0 0 0] and Y = [1 1 1]? So I ended up with a weightless tanh layer preceded by a relu activation, which guarantees an output in the [0,1] range for each label “independently”, and I set the threshold at 0.5, i.e. each output is rounded to the nearest of 0 and 1, the value that minimises the loss.
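To see the problem with softmax concretely, here is a small illustrative snippet in plain Julia (the softmax and myrelu helpers below are just local toy definitions, not BetaML functions): softmax always normalises its output to sum to 1, so it cannot represent "no label active", while tanh of a relu-ed (non-negative) value stays in [0,1) for each label independently.

```julia
# A toy softmax: its components always sum to 1, whatever the input scale,
# so "no label active" and "all labels active" cannot both be encoded
softmax(z) = exp.(z) ./ sum(exp.(z))
sum(softmax([-5.0, -5.0, -5.0]))  # 1.0
sum(softmax([ 5.0,  5.0,  5.0]))  # 1.0

# tanh applied to a non-negative (relu) input is instead in [0,1) for each label
myrelu(x) = max(0.0, x)
tanh(myrelu(-2.3))  # 0.0    -> label "off"
tanh(myrelu( 3.0))  # ≈0.995 -> label "on"
```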
Have a look at https://github.com/beacon-biosignals/Lighthouse.jl and its associated extension packages.