Comparing RNNs versus Neural ODEs for time series prediction

Hey folks.
I have been playing around with some simple Neural ODEs (NODEs) in DiffEqFlux, and I have worked with
RNNs in the past. I am trying to understand how prediction accuracy differs between RNNs and NODEs, that is, to separate fact from fiction about NODEs.

The original Chen et al. paper on Neural ODEs describes NODEs as the continuous extension of a Residual Network (ResNet). ResNets can have very many layers because of their skip connections and other architectural features. In practice, NODEs are also used for continuous time-series models; this is one of the key applications advocated in the original paper. And time series modeling is, of course, the traditional territory of Recurrent Neural Networks (RNNs).
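
To make the ResNet connection concrete, here is a minimal sketch in plain Python (not DiffEqFlux). It assumes a hypothetical scalar "layer" function `f` that stands in for a trained network, and it writes the residual update with an explicit step size `dt`, which a real ResNet would normally absorb into its weights. Stacking more and more residual layers over a fixed time interval is then just forward Euler with a smaller step, so the output approaches the solution of dh/dt = f(h, t).

```python
import math

def f(h, t):
    # Hypothetical fixed "layer" function (an assumption for illustration);
    # in a real model this would be a trained neural network.
    return math.tanh(-h + math.sin(t))

def resnet_forward(h0, num_layers, t_end=1.0):
    # Each residual block computes h <- h + dt * f(h, t),
    # which is exactly one forward-Euler step of dh/dt = f(h, t).
    dt = t_end / num_layers
    h = h0
    for k in range(num_layers):
        h = h + dt * f(h, k * dt)
    return h

coarse = resnet_forward(1.0, 4)          # a shallow "ResNet"
fine = resnet_forward(1.0, 4000)         # a much deeper one
reference = resnet_forward(1.0, 100000)  # stand-in for the continuous limit

print(coarse, fine, reference)
```

The deeper stack should land much closer to the reference value than the shallow one, which is the sense in which a NODE is the "infinitely deep" limit. An actual NODE would replace the fixed-step loop with an adaptive ODE solver.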

I don’t think there is a definitive statement on whether RNNs or NODEs perform better on time series prediction tasks, but I was hoping someone with more experience than me might have some anecdotal evidence suggesting which models perform better, or whether the two are roughly equivalent. Indeed, in reading the Augmented Neural ODE paper, I saw the authors suggest that RNNs, because they are discrete, are actually more expressive than a NODE, due to the topological limitations of a NODE. But that claim raises the question of whether Augmented NODEs have better prediction accuracy than RNNs.
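
As I understand the topological limitation, it is easiest to see in one dimension: trajectories of an ODE cannot cross, so the flow can never swap the order of two points, whereas a single discrete update can jump over them. The sketch below is my own toy illustration in plain Python, not code from the paper. The linear `field` is a hypothetical choice, and the discrete residual step stands in for any discrete-time update such as a ResNet block or an RNN step.

```python
def field(x):
    # Hypothetical vector field, chosen only for illustration.
    return -2.0 * x

def residual_block(x):
    # One discrete update with step size 1: x + f(x) = -x, which swaps the points.
    return x + field(x)

def ode_flow(x, t_end=1.0, steps=10000):
    # Fine forward-Euler approximation of the continuous flow of dx/dt = f(x).
    dt = t_end / steps
    for _ in range(steps):
        x = x + dt * field(x)
    return x

a, b = -1.0, 1.0
discrete = (residual_block(a), residual_block(b))  # order of a and b is flipped
continuous = (ode_flow(a), ode_flow(b))            # order of a and b is preserved

print(discrete, continuous)
```

With the same vector field, the discrete map sends -1 to 1 and 1 to -1, while the continuous flow only shrinks both points toward zero without letting them pass each other. Augmenting the state with extra dimensions gives the trajectories room to go around one another, which is the idea behind Augmented NODEs.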

The short version is “it depends”. For some problems something discrete is a better inductive bias. For some problems something continuous is a better inductive bias.

Cf. Chapter 1 of On Neural Differential Equations for a summary along these lines (plus Chapter 3 for connections to RNNs).


@patrick-kidger Yes, thanks for the direction. So it depends on the problem. I was actually reading your thesis, which you linked to; it is really very nice. I am halfway through chapter 2, so now I can look forward to chapter 3. Your comments are very helpful, since they suggest that there are different tradeoffs, such as training time versus accuracy versus …