Why do we need 3 chains to solve a PDE using NeuralPDE?

I am working through examples from the NeuralPDE.jl package and have a question. I am looking at the inverse problem example (i.e., parameter estimation) and just need to get my thinking straight. This is more of a conceptual/theory question than a Julia one.

Let’s consider the Lorenz system as given in the example.

$$
\begin{aligned}
x' &= \sigma(y - x) \\
y' &= x(\rho - z) - y \\
z' &= xy - \beta z
\end{aligned}
$$

The parameters to be estimated here are $\sigma$, $\rho$, and $\beta$. From my understanding of the theory of PINNs, the underlying NN would have an input dimension of 1 (corresponding to time t) and an output dimension of 3 (corresponding to the outputs x, y, and z). In other words, there is just one NN being trained.
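
For concreteness, here is a minimal Lux sketch of the single-network architecture I expected (the hidden width `n` is an arbitrary choice of mine, not something from the example):

```julia
using Lux

# Hypothetical single network: 1 input (time t) -> 3 outputs (x, y, z).
# n is an assumed hidden-layer width, not a value from the example.
n = 16
single_chain = Lux.Chain(Dense(1, n, Lux.σ), Dense(n, n, Lux.σ), Dense(n, 3))
```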

However, in the example, I see there are three independent NNs being trained, each with an output dimension of 1 (so three neural networks, one each for x, y, and z):

```julia
chain1 = Lux.Chain(Dense(input_, n, Lux.σ), Dense(n, n, Lux.σ), Dense(n, n, Lux.σ),
                   Dense(n, 1))
chain2 = Lux.Chain(Dense(input_, n, Lux.σ), Dense(n, n, Lux.σ), Dense(n, n, Lux.σ),
                   Dense(n, 1))
chain3 = Lux.Chain(Dense(input_, n, Lux.σ), Dense(n, n, Lux.σ), Dense(n, n, Lux.σ),
                   Dense(n, 1))
discretization = NeuralPDE.PhysicsInformedNN(
    [chain1, chain2, chain3],
    NeuralPDE.GridTraining(dt),
    param_estim = true, # whether the parameters of the differential equation should be sent to the additional_loss function
    additional_loss = additional_loss)
```

I don’t really understand why we need three chains, each with a single output (i.e., dim = 1), instead of just one chain with 3 outputs. The documentation does say:

> `chain`: a vector of Flux.jl or Lux.jl chains with a d-dimensional input and a 1-dimensional output corresponding to each of the dependent variables.

but it does not answer my question. Could anyone recommend some readings?

Edit: In the dev version of the documentation, there is now an example of parameter estimation for the Lotka–Volterra model (a system with 2 equations and 4 parameters) in a Bayesian framework. In this example, the NN is exactly what I expected it to be; i.e., input of dimension 1 and output of dimension 2.
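
For reference, the single network in that example has roughly this shape (a sketch; the widths and activation are my assumptions):

```julia
using Lux

# Sketch of the single-network shape in the Lotka–Volterra BNNODE example:
# 1 input (t) -> 2 outputs (prey, predator). Widths here are assumptions.
chain = Lux.Chain(Dense(1, 8, tanh), Dense(8, 8, tanh), Dense(8, 2))
```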

So now the documentation has two examples of similar models, but one uses the BNNODE interface (with a single NN) and the other uses the PhysicsInformedNN interface (with multiple NNs, one per equation).

@ChrisRackauckas tagging you for visibility and would appreciate your comment on this.

You can find the answer here: NeuralPDE systems of PDEs. In short, it is because you have 3 equations, so 1 NN per equation.

Thanks!

I am wondering why, in one of the examples, they are able to use a single neural network even though the system has more than 1 equation (see here). That example uses the BNNODE API, but I was wondering whether it’s possible to convert it to the PhysicsInformedNN API as well (I am guessing that PhysicsInformedNN is more primitive and should be able to handle the Bayesian example too).

The ODE formulation specializes on the fact that everything is differentiated at most once, and everything is differentiated at least once: every state appears with exactly one first derivative.

No. This is addressed in the writeup that explains NeuralPDE.jl.

The summary of that part is that with high-order automatic differentiation you get really bad scaling: the cost is multiplicative between the number of outputs and the derivative order. So you only want to differentiate what you actually need. If you pool everything into one neural network and only one term needs a second or third derivative, you’re pretty much stuck differentiating every output to third order, which has a cubic cost. Splitting the networks gives only linear cost growth, so it’s much cheaper.
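
To make that concrete, here is a small ForwardDiff sketch (with `u1` and `u2` as hypothetical stand-ins for two 1-output networks) of how splitting lets each output be differentiated only to the order its equation needs:

```julia
using ForwardDiff

# Hypothetical stand-ins for two separate 1-output networks:
# suppose the system needs u1''' but only u2'.
u1(t) = sin(t)
u2(t) = exp(-t)

d(f) = t -> ForwardDiff.derivative(f, t)  # first-derivative operator

d3u1 = d(d(d(u1)))  # third derivative of u1 alone
d1u2 = d(u2)        # first derivative of u2 alone

# With one 2-output network u(t) = [u1(t), u2(t)], taking third-order
# derivatives of the whole output vector would also push u2 through
# third-order AD, paying the high-order cost for a derivative the
# equations never use.
```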

I do plan to make it so it can be a choice in the future, but that’ll take some work.


@ChrisRackauckas

I spent some time today reading the writeup NeuralPDE (arxiv.org) and also this paper on using ANNs to solve ODEs, and I now have a fairly good understanding, but I am still not sure what you mean by “differentiated only once”. I have two questions:

  1. If I have a system of ODEs with constant parameters, should I stick with the NNODE API to solve the system?

  2. When is it required to use the PhysicsInformedNN API?

And I may create another topic for this, but what if my system of ODEs has a time-dependent parameter? Consider, for example, this basic SIR system:

$$
\begin{aligned}
S' &= -b(t)\, S I \\
I' &= b(t)\, S I - g I
\end{aligned}
$$

Given data for I, I’d like to solve the inverse problem of finding b(t), but I am not sure how to use the package to do that; it seems to me that I would need another NN for b(t). The sketch below shows what I have in mind.
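
Here is roughly the symbolic setup I am imagining, treating b(t) as a third unknown function with its own chain. I don’t know whether the package actually supports this for an inverse problem; everything below is my guess, not a working example:

```julia
using NeuralPDE, Lux, ModelingToolkit

@parameters t g
@variables S(..) I(..) b(..)
Dt = Differential(t)

# SIR with a time-dependent transmission rate b(t), written as if b were a
# third dependent variable (my guess at a formulation; untested).
eqs = [Dt(S(t)) ~ -b(t) * S(t) * I(t),
       Dt(I(t)) ~ b(t) * S(t) * I(t) - g * I(t)]

# One chain per unknown function, including an extra NN for b(t).
n = 12
chainS = Lux.Chain(Dense(1, n, Lux.σ), Dense(n, 1))
chainI = Lux.Chain(Dense(1, n, Lux.σ), Dense(n, 1))
chainb = Lux.Chain(Dense(1, n, Lux.σ), Dense(n, 1))
```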

Thanks in advance for your help, if you have the time!

Yes

PDEs

That’s physics-informed neural operators territory, which should get added soon.

Thanks @ChrisRackauckas! So, in the documentation there are two examples of inverse problems:

  1. Under the ODE section, this example uses the NNODE API to optimize the parameters of a system of ordinary differential equations. Here, a single NN with output dimension equal to the number of equations is used (see the sketch after this list).

  2. Also, under the PDE PINN section, this example uses the PhysicsInformedNN API for the inverse problem (though it’s still a system of ordinary differential equations), with 3 separate chains corresponding to the three solutions x(t), y(t), and z(t).
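
For reference, a condensed sketch of the NNODE-style setup from the first example (`additional_loss` is a placeholder here; the real one fits the data, and `prob` is the ODE problem defined in the docs):

```julia
using NeuralPDE, Lux, OptimizationOptimisers

# One chain with 3 outputs, one per state of the ODE system.
chain = Lux.Chain(Dense(1, 16, Lux.σ), Dense(16, 3))

# Placeholder for the data-fitting loss; the docs example compares the
# network solution phi(t, θ) against measured data here.
additional_loss(phi, θ) = 0.0

alg = NNODE(chain, OptimizationOptimisers.Adam(0.01);
            param_estim = true, additional_loss = additional_loss)
# sol = solve(prob, alg, maxiters = 5000)  # `prob` as defined in the example
```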

Should we consolidate the two or get rid of the second example? At the very least, perhaps we can state what’s different between them. I am happy to create an issue/PR if you let me know what the main differences are.

That one is fine. It should probably be replaced with a PDE example, though.