Training universal differential equations

Hello,

In a universal differential equation where the ODE + NN is not able to finish the integration routine, is there a procedure (or set of best practices) to follow in order to train the ODE?

If the integration does not finish, evaluating the loss throws an error because the solution has fewer points than the data. This most likely happens because the NN's estimate is poor and the system possibly becomes unstable.

Is there a procedure to adopt in this sort of situation? Are there any criteria to check whether the ODE will be stable? Can the output of the predict function be wrapped in a try-except block, so that if the integration works we do one thing and if not we do something else? Will AD be able to account for this?
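The guard asked about above can be sketched roughly as follows. This is a language-agnostic illustration in plain Python, not the SciML API: `euler_solve`, `guarded_loss`, the cubic stand-in for the NN term, and the penalty value are all hypothetical. The idea is to detect an aborted solve (or a length mismatch) before comparing against the data, and to return a large constant loss instead of raising an error.

```python
import math

def euler_solve(rhs, x0, t_end, n_steps, blowup=1e6):
    """Fixed-step explicit Euler; stops early if the state is non-finite or huge."""
    dt = t_end / n_steps
    xs = [x0]
    x = x0
    for _ in range(n_steps):
        x = x + dt * rhs(x)
        if not math.isfinite(x) or abs(x) > blowup:
            return xs, False  # aborted: fewer points than requested
        xs.append(x)
    return xs, True

def guarded_loss(theta, data, t_end, penalty=1e6):
    """MSE against data, or a large constant if the solve did not finish."""
    # Hypothetical stand-in for "mechanistic ODE + NN term".
    rhs = lambda x: -x + theta * x**3
    xs, ok = euler_solve(rhs, data[0], t_end, len(data) - 1)
    if not ok or len(xs) != len(data):
        return penalty
    return sum((a - b) ** 2 for a, b in zip(xs, data)) / len(data)

# Reference data from the stable system (theta = 0).
data, _ = euler_solve(lambda x: -x, 1.0, 2.0, 40)
stable = guarded_loss(0.0, data, 2.0)     # solve finishes, ordinary MSE
unstable = guarded_loss(50.0, data, 2.0)  # solve blows up, penalty returned
```

Note that the penalty branch is constant in the parameters, so it carries no gradient direction. A guard like this avoids the crash, but it does not by itself tell a gradient-based optimizer how to get back to a stable region.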

Iteratively growing the system is one good procedure:

https://docs.sciml.ai/SciMLSensitivity/dev/tutorials/training_tips/local_minima/
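One reading of "iteratively growing" (an assumption here; the linked page is the authority) is to fit a short initial stretch of the data first and then extend the window, warm-starting each stage from the previous result. Below is a minimal plain-Python sketch on a toy one-parameter decay model. The model, the window sizes, and the finite-difference gradient descent are all illustrative choices, and this toy problem would train without the schedule too, so it only shows the mechanics.

```python
def simulate(theta, x0, dt, n_points):
    """Explicit Euler on the toy model dx/dt = -theta * x."""
    xs = [x0]
    for _ in range(n_points - 1):
        xs.append(xs[-1] + dt * (-theta * xs[-1]))
    return xs

def mse(theta, data, dt):
    pred = simulate(theta, data[0], dt, len(data))
    return sum((p - d) ** 2 for p, d in zip(pred, data)) / len(data)

def fit(theta, data, dt, steps=200, lr=0.5, eps=1e-6):
    """Plain gradient descent with a central finite-difference gradient."""
    for _ in range(steps):
        grad = (mse(theta + eps, data, dt) - mse(theta - eps, data, dt)) / (2 * eps)
        theta -= lr * grad
    return theta

dt = 0.05
data = simulate(1.5, 1.0, dt, 41)  # synthetic data, true theta = 1.5

theta = 0.0
for n in (6, 11, 21, 41):          # grow the fitted window stage by stage
    theta = fit(theta, data[:n], dt)
```

Short windows keep a poorly initialized model from being integrated far into a region where it may diverge, and each later stage starts from parameters that already fit the early dynamics.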

Or do something like dividing the new term by a big constant (e.g. 1000) so that it starts as approximately zero, presuming your model is stable when the NN is zero.
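A small sketch of the scaling idea, in plain Python rather than the SciML stack. The function `nn` is a hypothetical stand-in for an untrained network that happens to produce large outputs, and the mechanistic part is assumed to be the stable system dx/dt = -x.

```python
import math

def nn(x):
    # Hypothetical stand-in for an untrained network with large outputs.
    return 40.0 * math.tanh(2.0 * x) + 25.0 * x

def final_state(scale, x0=1.0, t_end=2.0, n_steps=40):
    """Explicit Euler on dx/dt = -x + scale * nn(x); returns the state at t_end."""
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x = x + dt * (-x + scale * nn(x))
    return x

mechanistic = final_state(0.0)        # NN switched off
unscaled = final_state(1.0)           # NN term dominates and the state explodes
scaled = final_state(1.0 / 1000.0)    # NN term starts as a small perturbation
```

With the 1/1000 factor the first solves stay close to the mechanistic trajectory, so the loss can be evaluated from the very first iteration.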

Also: train just your mechanistic component first. Then freeze its parameters and train just the neural network part.

This means that the mechanistic part of your neural ODE will learn as much of the physics as it can, and the neural network part only has to learn the residual.
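As a rough illustration of "train, freeze, then train the rest", the sketch below only passes the names in a `trainable` list to the update step, so everything else stays fixed. It is plain Python with hypothetical names: `k` plays the mechanistic parameter and `w * x**2` stands in for the neural network term. It is not the way any particular library implements freezing.

```python
def simulate(params, x0, dt, n_points):
    """Explicit Euler on dx/dt = -k*x + w*x**2 (w*x**2 stands in for the NN)."""
    xs = [x0]
    for _ in range(n_points - 1):
        x = xs[-1]
        xs.append(x + dt * (-params["k"] * x + params["w"] * x * x))
    return xs

def mse(params, data, dt):
    pred = simulate(params, data[0], dt, len(data))
    return sum((p - d) ** 2 for p, d in zip(pred, data)) / len(data)

def train(params, trainable, data, dt, steps=500, lr=0.2, eps=1e-6):
    """Gradient descent that only updates the names listed in `trainable`."""
    params = dict(params)  # work on a copy; the caller's dict is untouched
    for _ in range(steps):
        grads = {}
        for name in trainable:
            up, down = dict(params), dict(params)
            up[name] += eps
            down[name] -= eps
            grads[name] = (mse(up, data, dt) - mse(down, data, dt)) / (2 * eps)
        for name in trainable:
            params[name] -= lr * grads[name]
    return params

dt = 0.05
data = simulate({"k": 1.0, "w": -0.5}, 1.0, dt, 41)  # synthetic "truth"

start = {"k": 0.0, "w": 0.0}
stage1 = train(start, ["k"], data, dt)   # mechanistic part only, NN term off
stage2 = train(stage1, ["w"], data, dt)  # k frozen, only the NN stand-in moves
```

In a real setup the same effect is usually obtained by handing the optimizer only the network's parameter vector and closing over the fixed mechanistic values inside the loss function.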

If you wish to learn anything about the values of the parameters of the mechanistic model, or “learn the physics”, this approach will likely lead to biased parameter estimates, unless by chance the residual happens to be zero mean and close to normally distributed (in the case of MSE loss). Having said that, it’s still probably worth doing if you’re experiencing issues with instability; just keep in mind that your “mechanistic model” is unlikely to represent some kind of “truth” or “physics”.

Hi,
Would you elaborate a bit on this? How can I freeze its parameters and train just the neural network part? Any documentation with this example?

this approach will likely lead to biased parameter estimates, unless by chance the residual happens to be zero mean and close to normally distributed (in case of MSE loss)

The try-except block? Or which approach are you referring to?

The approach to learn part of the model first, and then “complete the model” with a neural network later. The problem is that the parameters learned in the initial stage where the model is incomplete are likely to be wrong (a biased estimate).
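A toy example of that bias, in plain Python with made-up numbers: the data come from dx/dt = -k*x - c*x**2 with k = 1.0, but the first-stage model only contains the -k*x term. The least-squares estimate of k then absorbs part of the missing term.

```python
def simulate(k, c, x0, dt, n_points):
    """Explicit Euler on dx/dt = -k*x - c*x**2."""
    xs = [x0]
    for _ in range(n_points - 1):
        x = xs[-1]
        xs.append(x + dt * (-k * x - c * x * x))
    return xs

def mse_incomplete(k, data, dt):
    """Loss of the incomplete model (c fixed to 0) against the data."""
    pred = simulate(k, 0.0, data[0], dt, len(data))
    return sum((p - d) ** 2 for p, d in zip(pred, data)) / len(data)

dt = 0.05
k_true = 1.0
data = simulate(k_true, 0.5, 1.0, dt, 41)  # full model generates the data

# Brute-force least squares over a grid of k values for the incomplete model.
grid = [i / 1000 for i in range(3001)]
k_hat = min(grid, key=lambda k: mse_incomplete(k, data, dt))
```

Here `k_hat` comes out noticeably larger than `k_true`, because a faster linear decay is the best the incomplete model can do to mimic the missing quadratic term. Freezing that value and adding a network afterwards fits the data, but the frozen `k` is not the physical one.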


Can you elaborate a bit on how to achieve that?