I am trying to work out a simple application of NeuralODE for regression problems, based on the following tutorial.
Core idea here being (as per my understanding) that regression problems can be modelled as
with nn and x being the neural network and the independent variable.
Given below is my attempt to fit a Neural ODE, based on a 4-layer neural network, to a simple parabola (inspired by the MNIST example).
    using DiffEqFlux
    using Flux

    x = collect(-10:1.0:10)
    y_true = 2.0 .* x.^2 .+ 10.0

    nn = Chain(
        Dense(1,2,relu),
        Dense(2,2,relu),
        Dense(2,2,relu),
        Dense(2,1)
    )

    n_ode = NeuralODE(nn, (0.0, 1.0), Tsit5(), save_everystep=false, save_start = false)

    dataset = Flux.Data.DataLoader((x,y_true),batchsize=1)

    model = Chain(
        (x) -> x,
        n_ode,
        (x) -> Array(x)
    )

    function loss(x,y)
        return Flux.Losses.mse(model(x),y)
    end

    opt = Flux.Optimise.ADAMW(0.01)

    function cb()
        l = 0.0
        for i = 1:length(x)
            l += (y_true[i] - model([x[i]]))^2
        end
        println("Loss: $l")
        # @save "NonLinearModel.bson" model
        return false
    end

    Flux.@epochs 1000 Flux.train!(loss,Flux.params(n_ode.p),dataset,opt,cb=cb)
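To make my intent explicit, this is how I read the model above in equation form. The symbols u, θ and ŷ are my own notation (they do not appear in the code), and I am assuming that NeuralODE treats the input as the initial state and that, with save_everystep=false and save_start=false, only the state at t = 1 is returned.

```latex
\begin{aligned}
\frac{du}{dt} &= \mathrm{nn}\bigl(u(t);\,\theta\bigr), \qquad u(0) = x_i, \qquad t \in [0, 1], \\
\hat{y}_i &= u(1), \\
\ell_i(\theta) &= \bigl(\hat{y}_i - y_i\bigr)^2, \qquad y_i = 2 x_i^2 + 10 .
\end{aligned}
```

With batchsize=1, each training step uses one such per-sample term, and the callback prints the sum of these terms over all 21 points.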
But my program saturates at a loss value of around 75000 and never learns. Any help or comments in this regard are welcome. Thank you.
Also, as per the Flux.train! example here, to use the Flux.train! function we need to destructure the neural network and call the ODEProblem function. But would that not call the DifferentialEquations.jl ODEProblem, thus generating a huge backpropagation graph for each iteration of the ODE solve function? Where would it make use of the augmented dynamics of the NeuralODE? (I apologize in advance if any of the above is obviously wrong. I still haven't completely figured out Neural ODEs!)
Julia version = 1.6.1
DiffEqFlux = 1.41
Flux = 0.12.6