Why does DiffEqFlux.NeuralODEMM require a constraint equation and a mass matrix?

Hello everyone!

I’m looking at the “Enforcing Physical Constraints via Universal Differential-Algebraic Equations” example in the DiffEqFlux.jl documentation, which trains a neural ordinary differential equation to solve a stiff ODE with constraints. The model is defined as follows:

model_stiff_ndae = NeuralODEMM(nn_dudt2, (u, p, t) -> [u[1] + u[2] + u[3] - 1],
                               tspan, M, Rodas5(autodiff=false), saveat = 0.1)

As best I can tell, the mass matrix M = [1.0 0 0; 0 1.0 0; 0 0 0] and the constraint equation (u, p, t) -> [u[1] + u[2] + u[3] - 1] convey the same information: that the state variables of the ODE must always sum to 1. Why are both supplied in this example?

Thank you!

The algebraic equation isn’t well-defined without the mass matrix (MM), since it can be scaled.
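
To spell this out for the example above, here is a sketch of the system the solver sees. It assumes NeuralODEMM stacks the two outputs of the network on top of the constraint residual; the symbols NN_1, NN_2 and c are illustrative notation, not taken from the docs.

```latex
M \,\frac{du}{dt} = f(u, p, t), \qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \dot{u}_1 \\ \dot{u}_2 \\ \dot{u}_3 \end{bmatrix}
=
\begin{bmatrix} \mathrm{NN}_1(u, p) \\ \mathrm{NN}_2(u, p) \\ u_1 + u_2 + u_3 - 1 \end{bmatrix}
```

Row by row this reads \dot{u}_1 = NN_1, \dot{u}_2 = NN_2, and 0 = u_1 + u_2 + u_3 - 1. The zero row of M is what marks the third equation as algebraic, and the constraint function supplies the expression that must vanish. With M = I, the same third component would instead be read as the differential equation \dot{u}_3 = u_1 + u_2 + u_3 - 1. Conversely, M alone says which rows are algebraic but not what they contain. One way to read the scaling remark: 0 = c (u_1 + u_2 + u_3 - 1) describes the same constraint for any c ≠ 0, so the residual by itself does not pin down how it enters the system.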

@ChrisRackauckas Thank you, that makes sense. Can you also elaborate on the difference between NeuralDAE and NeuralODEMM (both in neural_de.jl at master · SciML/DiffEqFlux.jl on GitHub)? I see that the former only requires a constraint function, whereas the latter also needs the mass matrix. Does having the mass matrix enable more efficient calculation of adjoints or something?

NeuralODEMM is semi-explicit (mass-matrix form), while NeuralDAE is fully implicit. NeuralDAE itself is a bad architecture, and the MM form is more stable.
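
For reference, the two DAE forms being contrasted can be written as follows. This is a sketch in generic notation; F is just a name for the residual function, not something from the DiffEqFlux source.

```latex
\begin{aligned}
\text{semi-explicit (mass-matrix form):} \quad & M \,\dot{u} = f(u, p, t), \quad M \text{ singular} \\
\text{fully implicit:} \quad & 0 = F(\dot{u}, u, p, t)
\end{aligned}
```

The mass-matrix form is the special case F(\dot{u}, u, p, t) = M \dot{u} - f(u, p, t). In the mass-matrix form, the derivative enters only linearly and the algebraic rows are identified by the zero rows of M, which is the structure that mass-matrix-aware solvers such as the Rodas5 used in the example rely on.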