# Why the Loss function does not decrease significantly in Flux.jl

After experimenting with the activation function and the number of epochs, I still cannot fit the model to `y`, which is a function of the input data.

```julia
using Flux, Plots, Statistics
x = Array{Float64}(rand(5, 100));
w = [diff(x[1, :]); 0] ./ x[1, :];
y1 = cumsum(cos.(cumsum(w)));
scatter(y1)
y = reshape(y1, (1, 100));
data = [(x, y)];
```
```julia
model = Chain(Dense(5 => 100), Dense(100 => 1), identity)
model[1].weight;  # a Chain has no `.weight` field; index into its layers instead
```
```julia
loss(m, x, y) = Flux.mse(m(x), y)
Flux.mse(model(x), y)
Flux.mse(model(x), y) == mean((model(x) .- y) .^ 2)
```
```julia
opt_stat = Flux.setup(Adam(), model)  # optimizer state; this setup call was missing above
loss_history = []

epochs = 10000
for epoch in 1:epochs
    Flux.train!(loss, model, data, opt_stat)

    # print report
    train_loss = Flux.mse(model(x), y)
    push!(loss_history, train_loss)
    println("Epoch = $epoch : Training Loss = $train_loss")
end
```
```julia
ŷ = model(x)
Flux.mse(model(x), y)
Y = reshape(ŷ, (100, 1));
scatter(Y)
```

Your model is a linear function, because `Dense(in => out, σ=identity)` defaults to `identity` for its activation function, and a composition of linear layers is itself linear, so no amount of training will fit a nonlinear target. Try `model = Chain(Dense(5 => 100, relu), Dense(100 => 1, relu))` for instance. That gives an excellent fit.
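You can check the collapse of stacked linear layers into a single linear map directly with plain matrices (a minimal sketch independent of Flux; the sizes mirror the model above):

```julia
# Two affine (weights + bias) layers with no activation, sized like the model above.
W1, b1 = rand(100, 5), rand(100)   # plays the role of Dense(5 => 100)
W2, b2 = rand(1, 100), rand(1)     # plays the role of Dense(100 => 1)

x = rand(5)

# Applying the two layers in sequence...
two_layers = W2 * (W1 * x .+ b1) .+ b2

# ...equals a single affine layer with combined weights and bias:
W, b = W2 * W1, W2 * b1 .+ b2
one_layer = W * x .+ b

@assert two_layers ≈ one_layer
```

Since both compute the same function, adding depth without a nonlinearity adds no expressive power.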

The `identity` in the last layer of the `Chain` doesn’t do anything by the way.
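A quick check, without Flux: composing any function with `identity` leaves its output unchanged, so a trailing `identity` stage in a chain is a no-op (here `f` is just a stand-in for the rest of the chain):

```julia
f(x) = 2 .* x .+ 1   # stand-in for the preceding layers of the chain
g = identity ∘ f     # "chain" that ends in identity, applied after f

x = rand(3)
@assert g(x) == f(x)  # identical outputs
```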


Thanks, I assumed that `Dense` would already apply a sigmoid activation by default.