DiffEqFlux: neural_ode stops prematurely

fastwave · January 23, 2019, 5:03pm

I am trying to replicate the example in the README of DiffEqFlux https://github.com/JuliaDiffEq/DiffEqFlux.jl. Calling the neural_ode generated function makes Julia exit before training could begin.

The code is

using DifferentialEquations
using Flux, DiffEqFlux

function lotka_volterra(du,u,p,t)
  x, y = u
  α, β, δ, γ = p
  du[1] = dx = α*x - β*x*y
  du[2] = dy = -δ*y + γ*x*y
end
u0 = [1.0,1.0]
tspan = (0.0,10.0)
p = [1.5,1.0,3.0,1.0]
prob = ODEProblem(lotka_volterra,u0,tspan,p)
ode_data = Array(solve(prob,Tsit5(),saveat=0.1))

dudt = Chain(Dense(2,50,tanh),Dense(50,2))
tspan = (0.0f0,10.0f0)
n_ode = x->neural_ode(x,dudt,tspan,Tsit5(),saveat=0.1)

function predict_n_ode()
  n_ode(u0)
end
loss_n_ode() = sum(abs2,ode_data .- predict_n_ode())

data = Iterators.repeated((), 100)
opt = ADAM(0.1)

cb = function () # callback function to observe training
  display(loss_n_ode())
end

println("Before crashing")
n_ode(u0)
println("After crashing")

ChrisRackauckas · January 23, 2019, 7:04pm

That’s the old syntax (from last night, before we released). Basically, swap x and dudt:

n_ode = x->neural_ode(dudt,x,tspan,Tsit5(),saveat=0.1)

Where in the docs do we have this? It would be good to fix that.

Edit: Fixed the docs. Thanks for the report!

ChrisRackauckas · January 23, 2019, 7:08pm

BTW, I’d like to see what neural network you come up with to fit Lotka-Volterra. I was running the animations and recording them live on a Core i5 laptop, so I kept it to the simple case. But when I did try to train LV with one hidden layer, the NN didn’t seem big enough to capture the function. On my laptop I couldn’t use the GPUs, so I’m interested to see what kind of NN can be used here.

(Also, there’s a much better way to train this, but that’s the topic for another publication)

fastwave · January 24, 2019, 3:40pm

Hi, thanks for the help, it works now. I cannot use a GPU either, simply because I don’t have one. So far my experience is that these networks are difficult to train. I don’t think it is the size or depth of the network. I think it is because of the nature of ODEs. Perturbations are amplified exponentially in time and that is hard to handle with any optimisation. Anyway, I will do some more experimentation before making a judgement.
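To make the amplification argument concrete, here is a minimal sketch in plain Julia. It uses the test equation du/dt = λ*u with a hand-written forward-Euler loop; the equation, the function name `euler_final`, and the values of λ and ε are assumptions chosen for illustration and are not part of the DiffEqFlux example. Two runs that start ε apart end up roughly ε*exp(λ*T) apart.

```julia
# Forward-Euler integration of du/dt = λ*u up to time T.
# Hypothetical helper for illustration only, not a DiffEqFlux function.
function euler_final(u0, λ, T; h = 1e-3)
    u = u0
    for _ in 1:round(Int, T / h)
        u += h * λ * u
    end
    return u
end

λ, ε = 1.0, 1e-6   # growth rate and size of the initial perturbation (assumed values)
for T in (1.0, 5.0, 10.0)
    # Distance between the perturbed and unperturbed runs at time T
    gap = euler_final(1.0 + ε, λ, T) - euler_final(1.0, λ, T)
    println("T = ", T, "  gap = ", gap, "  ε*exp(λ*T) = ", ε * exp(λ * T))
end
```

The same growth applies to the gradient of a loss measured at late times, which is one reason long trajectories can be hard to fit.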

My strategy would be to train with many short trajectories first and then improve on that with a smaller number of longer trajectories. At the moment I have no clue how to do multiple trajectories; my modification of the loss function does not work. If you can give an example with two trajectories, that would be great. Thanks.
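For reference, a loss over several trajectories can be sketched as a sum of per-trajectory squared errors. The snippet below is plain Julia and runs without any packages: `multi_loss`, `toy_predict`, `u0s` and `datas` are hypothetical names, and `toy_predict` is a stand-in for the role `n_ode` plays in this thread. Whether the same pattern differentiates cleanly through `neural_ode` has not been checked here.

```julia
# Sum the squared error over several trajectories.
# `predict` maps an initial condition to a (states by times) array.
multi_loss(predict, u0s, datas) =
    sum(sum(abs2, data .- predict(u0)) for (u0, data) in zip(u0s, datas))

# Stand-in predictor: exact solution of du/dt = -u sampled at ts.
ts = 0.0:0.1:1.0
toy_predict(u0) = hcat((u0 .* exp(-t) for t in ts)...)

u0s   = [[1.0, 1.0], [2.0, 0.5]]             # two initial conditions
datas = [toy_predict(u0) for u0 in u0s]      # one data array per trajectory

println(multi_loss(toy_predict, u0s, datas))                     # matching model
println(multi_loss(u0 -> 1.1 .* toy_predict(u0), u0s, datas))    # mismatched model
```

With the objects from the first post, the analogous definition would presumably be `loss_n_ode() = multi_loss(n_ode, u0s, datas)`, with each entry of `datas` generated by solving the ODE from the matching entry of `u0s`.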

ChrisRackauckas · January 24, 2019, 3:56pm

Yup, that’s definitely the case.

That’s multiple shooting. We actually do that in DiffEq-proper: http://docs.juliadiffeq.org/latest/analysis/parameter_estimation.html. We will be putting a paper out on how to design loss functions that improve the fitting. What the blog post shows is training with single shooting, which is what the paper shows, but we know that there are better ways.
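As a rough illustration of the idea, here is a simplified multiple-shooting-style loss in plain Julia. It cuts one long data set into short segments, restarts the model from the data point at the start of each segment, and sums the squared errors. This is a sketch under assumptions, not the DiffEq implementation: `shooting_loss`, `flow` and `seglen` are hypothetical names, `predict(u0, ts)` stands in for a solver call, and a full multiple-shooting method would usually also treat the segment start states as unknowns with continuity conditions between segments.

```julia
# Segment-wise loss: restart from the data at the start of every segment.
# `predict(u0, ts)` must return a (states by length(ts)) array.
function shooting_loss(predict, data, ts, seglen)
    total = 0.0
    for i in 1:seglen:(length(ts) - 1)
        j = min(i + seglen, length(ts))        # segment covers indices i..j
        pred = predict(data[:, i], ts[i:j])    # restart from the data at ts[i]
        total += sum(abs2, data[:, i:j] .- pred)
    end
    return total
end

# Toy check with du/dt = -u, whose exact flow is known.
ts = collect(0.0:0.1:2.0)
flow(u0, ts) = hcat((u0 .* exp(-(t - ts[1])) for t in ts)...)
data = flow([1.0, 2.0], ts)

println(shooting_loss(flow, data, ts, 5))                                 # matching model
println(shooting_loss((u0, ts) -> flow(1.05 .* u0, ts), data, ts, 5))     # mismatched model
```

Because every segment is short, an error in the model has little time to grow before the next restart, which keeps the loss surface better behaved than a single long shot.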
