How to make a basic classification in Flux?

I am trying to do a basic classification in Flux, but I am obtaining nonsensical results (accuracy far below what this toy problem should allow). Where am I going wrong?

These are my attempts:

Version 1: the most basic one

using Random, Flux

x = [0.1 10; 0.13 13; 0.17 17; 0.2 20; 1 1; 1.3 1.3; 1.7 1.7; 2 2; 10 0.1; 13 0.13; 17 0.17; 20 0.2]
y = [1,1,1,1,2,2,2,2,3,3,3,3]

Random.seed!(123)

l1 = Dense(2,3,Flux.relu)
l2 = Flux.softmax
Flux_nn    = Flux.Chain(l1,l2)
loss(x, y) = Flux.crossentropy(Flux_nn(x), y)
ps         = Flux.params(Flux_nn)
nndata     = Flux.Data.DataLoader((x', y'), batchsize=3,shuffle=true)
Flux.@epochs 300 Flux.train!(loss, ps, nndata, Flux.ADAM())
ŷ = Flux.onecold(Flux_nn(x'),1:3)
acc = sum(y .== ŷ) / length(y) # 0.33

Version 2: using one-hot encoded data directly

x2 = x'
y_oh = Flux.onehotbatch(y,1:3)

Random.seed!(123)

l1 = Dense(2,3,Flux.relu)
l3 = Flux.softmax
Flux_nn    = Flux.Chain(l1,l3)
loss(x, y) = Flux.crossentropy(Flux_nn(x), y)
ps         = Flux.params(Flux_nn)
nndata     = Flux.Data.DataLoader((x2, y_oh), batchsize=3,shuffle=true)
Flux.@epochs 300 Flux.train!(loss, ps, nndata, Flux.ADAM())
ŷ = Flux.onecold(Flux_nn(x'),1:3)
acc = sum(y .== ŷ) / length(y) # 0.66.. still...

Version 3: with more training data…

x = rand(MersenneTwister(123),200,3)

function makey(x)
    # the class is a band of the linear score x[1] + 2x[2] - 3x[3]
    if x[1] + 2 * x[2] - 3 * x[3] < 0
        return 1
    elseif x[1] + 2 * x[2] - 3 * x[3] < 1
        return 2
    else
        return 3
    end
end

y = [makey(r) for r in eachrow(x)] 

Random.seed!(123)

l1 = Dense(3,5,Flux.relu)
l2 = Dense(5,3,Flux.relu)
l3 = Flux.softmax
Flux_nn    = Flux.Chain(l1,l2,l3)
loss(x, y) = Flux.crossentropy(Flux_nn(x), y)
ps         = Flux.params(Flux_nn)
nndata     = Flux.Data.DataLoader((x', y'), batchsize=8,shuffle=true)
Flux.@epochs 300 Flux.train!(loss, ps, nndata, Flux.ADAM())
ŷ = Flux.onecold(Flux_nn(x'),1:3)
acc = sum(y .== ŷ) / length(y) # 0.54

The first version does not have a hidden layer (neither does the second), and its y data is not compatible with softmax: crossentropy expects one-hot encoded targets with one column per sample, not a row of raw labels.
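As a minimal sketch of that last point (my own example, not from your code), assuming a 3-class problem with 4 samples:

using Flux

y_raw = [1, 2, 3, 2]
y_oh  = Flux.onehotbatch(y_raw, 1:3)       # 3×4 one-hot matrix, one column per sample
p     = Flux.softmax(randn(Float32, 3, 4)) # stand-in network output: columns sum to 1
Flux.crossentropy(p, y_oh)                 # well defined; a row of raw labels is not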

Try with this:

using Random
using Flux
Random.seed!(123)

x = Float32.([0.1 10; 0.13 13; 0.17 17; 0.2 20; 1 1; 1.3 1.3; 1.7 1.7; 2 2; 10 0.1; 13 0.13; 17 0.17; 20 0.2]')
y = [1,1,1,1,2,2,2,2,3,3,3,3] |>
        i -> Flux.onehotbatch(i, 1:3) .|> 
        Float32
nndata = Flux.Data.DataLoader((x, y), batchsize=3,shuffle=true)

Flux_nn = Chain(
    Dense(2, 10, relu),
    Dense(10, 3),    # no relu here
    softmax
)
ps = Flux.params(Flux_nn)

loss(x, y) = Flux.crossentropy(Flux_nn(x), y)

opt = ADAM()
Flux.@epochs 50 Flux.train!(loss, ps, nndata, opt)     

acc = sum(Flux.onecold(Flux_nn(x), 1:3) .== Flux.onecold(y, 1:3)) / size(y, 2) # 1.0
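Once trained, you can predict a new point the same way (a small sketch; x_new is a made-up observation, not from the original data):

x_new = Float32[0.15, 15.0]   # a single sample: a column of 2 features
probs = Flux_nn(x_new)        # 3 class probabilities summing to 1 (softmax output)
Flux.onecold(probs, 1:3)      # label of the largest probability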

Thank you! Basically my second version was fine, except that it didn't have enough "neurons" (I thought they weren't needed for something so simple).

As a note, I learned that in Flux you can use logitcrossentropy as the loss function instead of a softmax output layer plus a crossentropy loss, to reduce numerical round-off errors, i.e.:

x = [0.1 10; 0.13 13; 0.17 17; 0.2 20; 1 1; 1.3 1.3; 1.7 1.7; 2 2; 10 0.1; 13 0.13; 17 0.17; 20 0.2]
y = [1,1,1,1,2,2,2,2,3,3,3,3]
x2 = x'
y_oh = Flux.onehotbatch(y,1:3)
Random.seed!(123)
l1         = Dense(2,10,Flux.relu)
l2         = Dense(10, 3)
# no softmax here...
Flux_nn    = Flux.Chain(l1,l2)
loss(x, y) = Flux.logitcrossentropy(Flux_nn(x), y)
ps         = Flux.params(Flux_nn)
nndata     = Flux.Data.DataLoader((x2, y_oh), batchsize=3,shuffle=true)
Flux.@epochs 50 Flux.train!(loss, ps, nndata, Flux.ADAM())
ŷ          = Flux.onecold(Flux_nn(x2),1:3)
acc        = sum(y .== ŷ) / length(y) # 1.0
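To convince myself the two formulations really agree, a quick numerical check (a sketch with made-up logits, not part of the trained model above):

logits  = randn(Float32, 3, 5)                # raw scores, no softmax applied
targets = Flux.onehotbatch(rand(1:3, 5), 1:3)
Flux.logitcrossentropy(logits, targets) ≈
    Flux.crossentropy(Flux.softmax(logits), targets)  # true, up to Float32 rounding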