Very simple Flux model refusing to converge

I’m trying to do something very simple for now: a nonlinear curve fit of a sine wave using Flux.jl. So far the model refuses to converge and gives some very strange results. I must be doing something wrong, because I’ve done this before and it worked very well.

using Flux, Plots, Statistics

timespan = 0:0.5:4*pi
out_dat = sin.(timespan)

hidden = 5

dat = [([x],y) for (x,y) in zip(timespan,out_dat)]

model = Flux.Chain(
    Flux.Dense(1 => hidden, relu),
    Flux.Dense(hidden => 1))

opt_state = Flux.setup(Adam(), model)
loss(mod,x,y) = Flux.Losses.mse(mod(x), y)
mean([loss(model,x...) for x in dat])

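meanerr = 100
i = 0
# Train until the mean loss over the data set drops below 0.1.
while meanerr > 0.1
    # `global` lets the loop update these top-level variables when run as a script.
    global i = i + 1
    Flux.train!(loss, model, dat, opt_state)
    # Re-evaluate the mean loss every 10 epochs.
    if i % 10 == 0
        global meanerr = mean([loss(model, x...) for x in dat])
    end
end
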
NNresult = vcat(model.([[t] for t in timespan])...)

plot(NNresult, seriestype = :scatter)

As you can see, the problem is very simple. But when I run the code, the convergence is terrible even after literally thousands of epochs. For example, after 6000 epochs I get the following output:

I’ve tried it with all kinds of different settings, activation functions, and numbers of hidden neurons. It’s probably something simple I’m missing, so if anyone can spot anything wrong I would be very grateful. Thanks.

It looks like I just needed to add some more layers, which surprised me, because I thought I had done it before with just one layer. Oh well.
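
For anyone landing here later, a deeper version of the model might look roughly like the sketch below. The thread does not say how many layers or units were actually added, so the width of 16, the single extra hidden layer, the `tanh` activation, and the name `deeper_model` are all assumptions for illustration, not the settings that were used.

```julia
using Flux

# Assumed width: the thread does not state the size that finally worked.
hidden = 16

# Same input and output sizes as the original model, with one extra hidden layer.
deeper_model = Chain(
    Dense(1 => hidden, tanh),       # input -> first hidden layer
    Dense(hidden => hidden, tanh),  # extra hidden layer (assumption)
    Dense(hidden => 1))             # linear output layer

# Quick shape check on a single input, matching how the loss calls the model.
y = deeper_model(Float32[0.5])
@show size(y)
```

It should drop into the rest of the script unchanged, as long as `Flux.setup` is called again on the new model before training.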

According to the Universal Approximation Theorem, one hidden layer should indeed be enough to approximate virtually any continuous function, provided the number of nodes in that layer is “large enough”… which I guess 5 nodes isn’t :sweat_smile:
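
To make that concrete, the model in the original post can be written in this single-hidden-layer form, with N = hidden = 5 and relu as the activation. The symbols v_i, w_i, b_i and c are just notation chosen here for the weights and biases, not names from the code.

```latex
f(x) = c + \sum_{i=1}^{N} v_i \, \sigma(w_i x + b_i), \qquad \sigma(z) = \max(0, z)
```

Each term in the sum contributes at most one kink (at x = -b_i / w_i), so with relu the fitted curve is piecewise linear with at most N kinks.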

I actually tried it with many different numbers of hidden neurons. That’s why it surprised me that such a simple problem had so much trouble converging, and why I thought I was doing something wrong. It turns out that training is just difficult.