I find I still don’t fully understand neural ordinary differential equations.
As shown in the literature (Lu et al., 2017; Haber and Ruthotto, 2017; Chen et al., 2018), a sequence of
transformations
h(t+1) = h(t) + f(h(t), theta)
can be turned into an ODE in the limiting case
dh(t)/dt = f(h(t), theta)
This formulation holds only if theta is a function of time.
In that sense, we can say we have an equivalent block of infinite depth.
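To make sure I have the limit right, here is a toy sketch of how I read it (the helper names resnet_step and euler_step are my own, not from any library): one residual block is one explicit Euler step with step size 1, and shrinking the step size while composing more steps gives the continuous formulation.

# Toy sketch (illustrative names, not library code): one ResNet block is one Euler step.
W1 = randn(Float32, 50, 2); b1 = zeros(Float32, 50)
W2 = randn(Float32, 2, 50); b2 = zeros(Float32, 2)
f(h) = W2 * tanh.(W1 * h .+ b1) .+ b2        # the vector field f(h, theta)
resnet_step(h) = h + f(h)                    # h(t+1) = h(t) + f(h(t), theta)
euler_step(h, dt) = h + dt * f(h)            # h(t+dt) = h(t) + dt * f(h(t), theta)
# Taking dt -> 0 while composing more and more euler_steps recovers dh/dt = f(h(t), theta).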
However, in practice we just define a network with a fixed number of parameters.
For example, in the following code,
using DiffEqFlux, OrdinaryDiffEq

# two-layer network used as the right-hand side f(h, theta)
dudt = FastChain(FastDense(2, 50, tanh), FastDense(50, 2))
u0 = Float32[2.0; 0.0]
tspan = (0.0f0, 1.0f0)
nn = NeuralODE(dudt, tspan, Tsit5())   # solve du/dt = dudt(u, p) over tspan with Tsit5
nn.p
is a fixed-size vector with 252 entries (2*50 + 50 weights and biases in the first layer, plus 50*2 + 2 in the second).
Although the Tsit5 solver feeds the network's output back in as the input at each internal step, the same parameter vector nn.p is used for every evaluation.
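To check this, here is a plain OrdinaryDiffEq sketch (the rhs function and the call counter are just my own illustration, not part of DiffEqFlux): the solver hands the same fixed p to every right-hand-side evaluation it makes.

using OrdinaryDiffEq
ncalls = Ref(0)                                        # count right-hand-side evaluations
p = (W1 = randn(Float32, 50, 2), W2 = randn(Float32, 2, 50))
function rhs(u, p, t)
    ncalls[] += 1                                      # Tsit5 passes the same p every time
    return p.W2 * tanh.(p.W1 * u)
end
prob = ODEProblem(rhs, Float32[2.0; 0.0], (0.0f0, 1.0f0), p)
sol = solve(prob, Tsit5())
ncalls[]                                               # many calls, all with the identical p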
Does the neural ODE have more learning capacity than a single layer, or are they just the same in this case?