The following code learns far less than it should.
I create two groups, labelled 0 and 1, each containing two points, and want a single sigmoid unit to separate them, using mean squared error as the loss.
During training, the loss isn't decreasing, as reported by the callback:
using Flux
using Plots   # needed for plot, contour and scatter below
xs_test = [[0.5, 0.25],[0.5, 0.25],[0.5, 0.5],[0.5, 0.5]]
ys_test = [0, 0, 1, 1]
model = Dense(2, 1, σ)   # one sigmoid unit
model.W .= ones(1, 2)    # fixed initial weights, for reproducibility
model.b .= ones(1)
loss(x, y) = Flux.mse(model(x), y)
ps = params(model)
data = zip(xs_test, ys_test)
opt = Descent(0.1)
plot(loss.(xs_test, ys_test), label="before training")
Flux.train!(loss, ps, data, opt; cb = () -> println("Current loss: ", sum(loss.(xs_test, ys_test))))
plot!(loss.(xs_test, ys_test), label="after training")
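To sanity-check the setup independently of Flux, the same forward pass and MSE gradient can be computed by hand in plain Julia. This is only a sketch, assuming Dense(2, 1, σ) computes σ.(W*x .+ b); the variable names are mine, not Flux's:

```julia
# Hand-computed forward pass and gradient for a single sigmoid unit,
# mirroring Dense(2, 1, σ) with W = ones(1, 2), b = ones(1) and MSE loss.
σ(z) = 1 / (1 + exp(-z))

w = [1.0, 1.0]; b = 1.0
x = [0.5, 0.25]; y = 0.0        # one of the group-0 points

z = w'x + b                     # pre-activation: 1.75
ŷ = σ(z)                        # model output, ≈ 0.85
L = (ŷ - y)^2                   # per-sample squared error, ≈ 0.73

# Chain rule: dL/dz = 2(ŷ - y) · σ'(z), with σ'(z) = σ(z)(1 - σ(z))
dLdz = 2 * (ŷ - y) * ŷ * (1 - ŷ)
∇w = dLdz .* x                  # gradient w.r.t. the weights
∇b = dLdz                       # gradient w.r.t. the bias, ≈ 0.21
```

The gradient is small because σ is fairly saturated at z = 1.75, but it is nonzero, so each Descent step should still move the loss a little.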
But after training, there is no improvement in separation:
contour(0:.1:1, 0:.1:1, (x, y) -> model([x,y])[1], fill=true)
scatter!(first.(xs_test[1:2]), last.(xs_test[1:2]), label="group 0")
scatter!(first.(xs_test[3:4]), last.(xs_test[3:4]), label="group 1")
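For comparison, here is a minimal, dependency-free re-implementation of the per-sample gradient-descent updates that train! with Descent(0.1) performs, run for many epochs rather than the single pass above. The names (xs, ys, sgd) and the epoch count are mine; the update rule is just the squared-error gradient through the sigmoid via the chain rule:

```julia
# Plain-Julia re-implementation of the per-sample Descent(0.1) updates,
# repeated for many epochs instead of the single pass train! performs.
σ(z) = 1 / (1 + exp(-z))

xs = [[0.5, 0.25], [0.5, 0.25], [0.5, 0.5], [0.5, 0.5]]
ys = [0.0, 0.0, 1.0, 1.0]

loss(w, b) = sum((σ(w'x + b) - y)^2 for (x, y) in zip(xs, ys))

function sgd(w, b; η = 0.1, epochs = 1000)
    for _ in 1:epochs
        for (x, y) in zip(xs, ys)
            ŷ = σ(w'x + b)
            g = 2 * (ŷ - y) * ŷ * (1 - ŷ)  # dL/dz via the chain rule
            w = w .- η .* g .* x           # the step Descent(0.1) would take
            b = b - η * g
        end
    end
    return w, b
end

w0, b0 = [1.0, 1.0], 1.0   # same initial values as the Flux model
w1, b1 = sgd(w0, b0)
println("loss before: ", loss(w0, b0), "  after: ", loss(w1, b1))
```

Comparing the loss before and after many such passes helps distinguish "the model cannot fit this data" from "one pass of train! is simply not enough".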