Hi, after working with ordinary differential equations so far, I now have to numerically solve a partial differential equation (PDE) in Julia, and I’m not sure where to start.
My equation is like the usual wave equation in physics, with extra bells and whistles.
The plain wave eq’n is: Ftt - (c^2 * Fxx) = 0
where F is a function of t and x, and Ftt means the 2nd derivative of F with respect to t, and so on. My modified equation is:
Ftt - (c^2 * Fxx) = G(t,x) * F(t,x)
where G(t,x) is another function that is complicated (non-separable): it contains exponentials and sines/cosines of (t - x), and a polynomial in t.
I need an algorithm (or algorithms) that gives stable, accurate results without blowing up my computer’s memory. Does anyone know which PDE solver(s) in Julia are best for this, and where I can find documentation on how to use them? Thanks!
I don’t know what you mean by the domain being Cartesian or not.
The problem is from general relativity, so the coordinates are definitely not Cartesian.
But I’ve reduced the problem to the equation above, and the coordinates t and x scale the same way as each other, so t and x could probably be sampled with equal fineness. Is that what you mean?
Discretize in space, either by hand or with something like ApproxFun or DiffEqOperators, and then solve the remaining ODEs with an ODE solver. That’s a semilinear wave equation, so that’s a useful term to search for when looking this up.
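To make the “by hand” route concrete, here is a minimal method-of-lines sketch (everything in it is illustrative: a placeholder G, c = 1, endpoints simply held fixed as a stand-in for the real boundary conditions, and made-up initial data). It discretizes F_xx with second-order central differences and hands the resulting ODE system to OrdinaryDiffEq:

```julia
using OrdinaryDiffEq

const c  = 1.0
const Nx = 201
const xs = range(-3.0, 3.0; length = Nx)
const dx = step(xs)

# Placeholder for the real G(t, x); swap in the actual function.
G(t, x) = exp(-(t - x)^2) * (1 + t^2)

# State layout: u = [F_1, ..., F_Nx, Ft_1, ..., Ft_Nx]
function wave_rhs!(du, u, p, t)
    F  = @view u[1:Nx]
    Ft = @view u[Nx+1:2Nx]
    du[1:Nx] .= Ft
    for i in 2:Nx-1
        Fxx = (F[i-1] - 2F[i] + F[i+1]) / dx^2   # 2nd-order central difference
        du[Nx+i] = c^2 * Fxx + G(t, xs[i]) * F[i]
    end
    # Endpoints held fixed at their initial values (stand-in for real BCs)
    du[1] = du[Nx] = du[Nx+1] = du[2Nx] = 0.0
    return nothing
end

F0  = [exp(-x^2) for x in xs]   # placeholder initial displacement
Ft0 = zeros(Nx)                 # placeholder initial velocity
prob = ODEProblem(wave_rhs!, vcat(F0, Ft0), (0.1, 3.0))
sol  = solve(prob, Tsit5(); saveat = 0.05)
# sol(t)[1:Nx] gives F(t, x) on the grid xs
```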
There are so many ways to discretize this; it depends a lot on how much accuracy you require and on the values that G can take on. For example, for moderate accuracies and bounded |G| it should be pretty easy to derive a stable explicit timestepping scheme (most commonly for wave equations you use a second-order explicit scheme). Or you could use the method of lines (discretizing space but then plugging the result into a generic ODE solver for time), which may be a good idea if G is large enough to make your equations stiff. If the values of G are such that the solution takes on extreme behavior in small regions of space or time, of course, then you may need fancier nonuniform discretizations.
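For the explicit second-order time-stepping option, a minimal leapfrog sketch might look like the following (again with a placeholder G, c = 1, pinned endpoints, zero initial velocity, and a crude CFL-limited step; none of these choices come from the original problem):

```julia
# Leapfrog (central differences in both t and x) for Ftt = c^2 Fxx + G*F.
function leapfrog_wave(; c = 1.0, Nx = 301, tspan = (0.1, 3.0))
    xs = range(-3.0, 3.0; length = Nx)
    dx = step(xs)
    dt = 0.5 * dx / c                        # CFL safety factor of 0.5
    G(t, x) = exp(-(t - x)^2) * (1 + t^2)    # placeholder G
    Fold = [exp(-x^2) for x in xs]           # F at t0 (placeholder data)
    Fcur = copy(Fold)                        # F at t0 + dt (zero initial velocity)
    Fnew = similar(Fcur)
    t = tspan[1]
    while t < tspan[2]
        for i in 2:Nx-1
            Fxx = (Fcur[i-1] - 2Fcur[i] + Fcur[i+1]) / dx^2
            Fnew[i] = 2Fcur[i] - Fold[i] + dt^2 * (c^2 * Fxx + G(t, xs[i]) * Fcur[i])
        end
        Fnew[1] = Fnew[Nx] = 0.0             # pinned endpoints
        Fold, Fcur, Fnew = Fcur, Fnew, Fold  # rotate the three time levels
        t += dt
    end
    return xs, Fcur                          # F at (approximately) the final time
end

xs, F_final = leapfrog_wave()
```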
And, of course, your equations are linear, so you could Fourier transform (t \to \omega) and solve them as a system of linear equations in the frequency domain, discretized in space and solved with \. Since it looks like you have only one spatial dimension with smooth coefficients, something like the ApproxFun package could potentially lead to an extremely efficient discretization.
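A minimal sketch of the frequency-domain idea, under the simplifying assumption (mine, not part of the original problem) that G is independent of t, so that each frequency decouples into a Helmholtz-like boundary-value problem; the real t-dependent G turns the product G*F into a convolution over \omega and therefore a larger block linear system:

```julia
# For a t-independent G(x), each frequency ω satisfies
#   -ω² F̂ - c² F̂_xx - G(x) F̂ = S(x)   on x ∈ [-3, 3],  F̂(±3) = 0 (assumed Dirichlet),
# with some forcing S. Discretize F̂_xx with finite differences and solve with `\`.
using LinearAlgebra, SparseArrays

c  = 1.0
ω  = 2.0                         # one sample frequency
Nx = 401
xs = range(-3.0, 3.0; length = Nx)
dx = step(xs)
Gx(x) = 1 + x^2                  # placeholder stand-in for a t-independent G

xin = collect(xs[2:end-1])       # interior points (Dirichlet: F̂ = 0 at the ends)
n   = length(xin)
D2  = spdiagm(-1 => ones(n-1), 0 => -2ones(n), 1 => ones(n-1)) / dx^2

A    = -ω^2 * I - c^2 * D2 - spdiagm(0 => Gx.(xin))
S    = exp.(-xin .^ 2)           # placeholder forcing
Fhat = A \ S                     # direct solve with backslash
```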
Probably the simplest is to just define the equation symbolically and train a physics-informed neural network (PINN) on it. It’s not necessarily the most efficient, but if you’re looking for something easy, that would do it. Examples:
Maybe the simplest to set up, but probably the hardest to validate: it would take a lot of work, probably including implementing a separate solver for comparison, before I would trust the answers, and even then only in a restricted regime corresponding to the training data. (In contrast, for a method with known convergence properties, it is usually sufficient to validate on one or two cases with known solutions to eliminate bugs.)
Fitting approaches become especially difficult for wave equations as the diameter of the domain increases relative to the wavelength, because the increasing number of possible resonant/interference effects rapidly enlarges the effective dimension of the parameter space.
I see neural networks as a potential way to accelerate the solution of PDEs in cases where you have to re-solve many times for similar parameters, and maybe as a way to fit empirical data for which good models are not yet known.
What if you use the stochastic-sampling training strategy? I agree that no rigorous bound can be provided at the end, which makes it open to questions, but if many different random samples all have low error in the cost function and a plot reveals something that’s fairly smooth, would that be enough?
FWIW, one of the next things we’re doing in the library is making it estimate where the cost function is highest. This is mostly to enhance the sampling and accelerate the training process, but it turns out to be a lot like adaptive Monte Carlo schemes for multidimensional quadrature, so the methods for adapting the training can also be used to give (somewhat loose) error bounds.
My concern is that if the space you are sampling is incredibly high dimensional, no sampling scheme can provide reliable convergence with a reasonable number of samples. All high-dimensional Monte-Carlo methods have this problem—in high dimensions, something needs to be very special about your function for them to work well.
That being said, the PINN approach is essentially expanding the solution in some basis (in this case an NN) and minimizing the mean-square residual at some set of sampled points. From a certain perspective, especially for time-independent problems, this could be viewed as much like a collocation method where you solve the resulting equations Ax=b by re-casting them as the least-squares problem of minimizing \Vert b - Ax \Vert^2. In principle, this is a viable method. But that comparison immediately raises two concerns:
Why is a NN a good basis for representing the solution, compared e.g. to spectral methods or finite elements? If you have a smooth function f(\vec{x}) for a d\le 3 dimensional box domain \vec{x} \in [0,1]^d as in most of the PINN examples, surely an exponentially convergent spectral basis would be vastly more efficient (and probably more reliable)? And if there are discontinuities in the domain (e.g. piecewise constant coefficients common in engineering problems with multiple materials) or an oddly shaped domain, why shouldn’t we use finite elements? What do NNs bring to the table in low dimensions for non-noisy functions?
Solving a linear equation by gradient-descent methods (e.g. Adam) on the least-squares error seems like it effectively squares the condition number of A, much like the conjugate-gradient-squared (CGS) method. This kind of approach was mostly abandoned decades ago for solving large PDEs because of its slow convergence, so why should we resurrect it now? (A more sophisticated minimization-based approach might be something à la GMRES, but this seems beyond the reach of traditional NN training algorithms because they don’t know about the underlying linearity of the problem. It might be possible to develop a GMRES-like approach specifically for PINNs, though?)
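To spell out the “squares the condition number” point (a standard linear-algebra fact for full-column-rank A, not anything PINN-specific):

$$
\min_x \Vert b - Ax \Vert_2^2 \;\Longleftrightarrow\; A^{\mathsf T} A\, x = A^{\mathsf T} b,
\qquad \kappa_2(A^{\mathsf T} A) = \kappa_2(A)^2,
$$

so gradient-based iterations on the least-squares objective see \kappa(A)^2 rather than \kappa(A).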
I would definitely agree that a PINN isn’t the most efficient way to do this. Its true strength really comes from the fact that it’s fairly easy to implement for “weird cases”, like partial differential-algebraic equations, boundary conditions on the interior of the domain, or non-local PDEs like integro-differential equations. FWIW, that symbolic PDE interface is going to have finite difference, pseudospectral, and FEM (via FEniCS.jl) discretization options in the near future, but PINNs were the first one done (because of how simply they generalize to so many cases). I was mostly recommending it because if someone asks how to solve such a PDE, I assume they’re not after the most efficient way but the “easiest way”, and this is probably the closest to NDSolve-like automation that exists in Julia right now (still primitive but getting better). It’ll at least make the pretty pictures in a way that requires little effort, though the solution should be double-checked somehow.
Theory aside, it would be interesting to see whether PINNs can empirically give a lower-dimensional representation of the field than a discretization-based approach for the same accuracy, or a more accurate representation for the same number of parameters, on some classic PDE examples. Then maybe they can be combined, using “large” elements with an NN inside instead of traditional shape functions. In a lot of practical PDE applications, the number of elements can get ridiculously large and the memory requirements are huge.
What a lot of practitioners do to “detect” convergence is use mesh-sensitivity analysis: for example, re-running with a finer mesh and making sure the solution is not too different. So if memory savings can be made with a hybrid FEM-PINN approach, then we can use the mesh-sensitivity approach to develop some confidence in the field approximation obtained.
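As a rough illustration of such a check (reusing the toy wave-equation discretization from the earlier method-of-lines sketch, with the same placeholder G, initial data, and boundary handling; only the grid resolution differs between the two runs):

```julia
using OrdinaryDiffEq

function solve_wave(Nx; c = 1.0)
    xs = range(-3.0, 3.0; length = Nx)
    dx = step(xs)
    G(t, x) = exp(-(t - x)^2) * (1 + t^2)              # placeholder G
    function rhs!(du, u, p, t)
        F, Ft = @view(u[1:Nx]), @view(u[Nx+1:2Nx])
        du[1:Nx] .= Ft
        for i in 2:Nx-1
            Fxx = (F[i-1] - 2F[i] + F[i+1]) / dx^2
            du[Nx+i] = c^2 * Fxx + G(t, xs[i]) * F[i]
        end
        du[1] = du[Nx] = du[Nx+1] = du[2Nx] = 0.0      # fixed endpoints
        return nothing
    end
    u0 = vcat([exp(-x^2) for x in xs], zeros(Nx))      # placeholder data
    solve(ODEProblem(rhs!, u0, (0.1, 3.0)), Tsit5())
end

sol_c = solve_wave(101)    # coarse grid
sol_f = solve_wave(201)    # fine grid (shares every other point with the coarse one)
# Compare F at the final time on the shared (coarse) grid points:
err = maximum(abs, sol_c(3.0)[1:101] .- sol_f(3.0)[1:2:201])
println("max |F_coarse - F_fine| at t = 3.0: ", err)
```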
I can tell you right now it won’t come close. PINNs are an extremely compute-wasteful methodology: they will take many more parameters on low-dimensional problems than something more tuned like spectral elements, or even finite differencing. For reference, I’ve discussed this with a researcher in this area, and we’ve now gotten a GPU-accelerated PINN solving the Lotka-Volterra equations to 3 decimal places in about 240 seconds, compared to a serial ODE solver getting roughly 8 digits in a few microseconds. ODEs are the worst-case scenario, but even the heat equation (with not too big a domain) is a sub-second solve with finite difference methods but a few GPU-minutes for a PINN. If there is good theory around the problem, PINNs aren’t good and won’t be good. They should be efficient when the PDE is high dimensional (Schrödinger equations, Kolmogorov equations, … essentially things that evolve probability distributions over a large number of states), or if the PDE is non-local and good theoretical methods don’t exist (fractional PDEs, integro-differential equations, etc.).
The only reason I recommend it here is that it seems like the OP might just want one solution without much brainpower put into it. In that case, the use case is like Mathematica’s NDSolve, and PINNs are pretty good for building functionality like that (since it’s trivial to generate a loss function for “any” PDE), so the ModelingToolkit general PDE interface’s first generally usable method is a PINN. It’ll sap more electricity than it should, but it’ll let you describe the PDE symbolically and spit out something that can make plots. @stevengj brings up a really good point, though, that you should manually check the result, because PINNs do not have good theoretical bounds on the result. In fact, the current results are essentially just that they converge, but not necessarily with any good error bounds or a high convergence rate. These results also don’t tell you whether it has converged to the right solution, say a viscosity solution, so if you have a PDE with potential issues of that sort you will need to take more care (but this is only semilinear, so if I’m not mistaken you can prove uniqueness under mild assumptions, such as the added function being in L2, using a similar approach to what’s done for semilinear heat equations).
So there are a lot of caveats, but I think we can at least get error estimators on the result. This is still quite an active project though.
You can’t really take a finer mesh with a PINN because it’s a mesh-free method. This is where the suggestion of random sampling in other ways comes from: indeed, you should quantify the error at points different from the ones used to train the network. It turns out to be quite similar to adaptive Monte Carlo quadrature methods, which do have error estimators but also have hard-to-predict failure conditions (for example, if all of the density is concentrated in a thin region and none of the initial points hits it, AMC quadrature methods will calculate that the integral is zero with zero variance and exit…). That’s not necessarily the kind of behavior you will see in PDEs, but if a PDE has something like a kink in the domain, you might expect that kink to be where the maximal error is, and purely stochastic methods may never sample the loss there (or around there), so you do have to be careful. This is something we’re looking at a bit more closely.
But even then, it’s not going to be efficient (on “standard PDEs”), just hopefully very automatic.
Hi again, it seems that my question started an energetic discussion. Thanks for the suggestions. I think I should clarify a few things:
I’m not looking for a quick-and-dirty solution. I need an absolutely unimpeachable solution to this equation, a “bet my reputation on it” solution. This equation is the only one I’m focused on. So any method that needs other methods to validate it, or has questionable theoretical underpinnings, won’t do. (Neural networks sound interesting but seem very questionable for this.) I need a precise, accurate answer that I can be confident in and that won’t be questioned.
The problem is in 2 dimensions: t (time) and the spatial x-axis. The function G(t,x) is always positive, but otherwise very nasty. It goes from very large, to small, to very large again, and it is also always oscillating. For fixed x and a computation region t = [0.1, 3.0], G(t,x) varies by more than 6 orders of magnitude (a factor > 1 million). For a square computation region x = [-3, 3] and t = [0.1, 3.0], G(t,x) varies by more than 10 orders of magnitude.
The reason I’m asking this question isn’t because I’m trying to be clever, but because I am new to Julia (and to numerical work on PDEs), and I didn’t want to kludge something together that might produce nonsense results. I was hoping that Julia had a “plug and play” PDE solver for two-dimensional problems that I could just plug my equation into and trust the results, the way I did it for ODEs. That seems not to exist. What about PDESolver.jl? The name sounds promising, but I found little documentation.
If I have to turn the PDE into ODEs by discretizing x, the three suggestions seem to be (i) doing it “by hand”; (ii) using ApproxFun; (iii) using DiffEqOperators. I don’t know which of those three options I should choose (or why), I’m not entirely sure what they mean (“by hand”?), and I don’t see why using ApproxFun would be better than using the actual function G. I’m also not sure how to implement any of them, or how to turn a call to a one-dimensional ODE solver with only time derivatives into a two-dimensional solver that somehow feeds a grid of x-values into the differential equations for t.
I’ve only been using Julia since the spring, so any more detailed information would be welcome. Thanks!
The reason is that there are so many things you can do in PDEs to specialize the solver that it’s hard to make everyone happy. That said, we do have such an interface, plus documentation and a blog post that describe it in full and answer the questions of “what about my PDE method X?” or “what about my linear solver?”; it’s a very deep (multiple-dispatch-based) answer to how to solve this.
NeuralPDE.jl is just the first instantiation of the interface that is complete. Finite differences, which is actually what Mathematica’s plug-and-play NDSolve does under the hood, is being developed for this interface as well (https://github.com/SciML/DiffEqOperators.jl/pull/250; notice that the strength of the interface is how it can compose methods from other packages), with finite elements soon to follow. So a general Julia-wide PDE interface that incorporates a ton of different packages is almost complete. Hold tight for another week to see what that’s all about, but note that even when it comes out there’s still more work to do.
OK, it seems like you’re integrating by parts, if the “Rx” in the second term of the first equation is actually “Fx”, and the last term is the integral of (eta * G * F).
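If I’ve guessed right, the identity would read something like this (my reconstruction, with the boundary term from the integration by parts assumed to vanish):

$$
\int_\Omega \eta\, F_{tt}\, \mathrm{d}x \;+\; c^2 \int_\Omega \eta_x F_x\, \mathrm{d}x \;=\; \int_\Omega \eta\, G\, F\, \mathrm{d}x .
$$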
But what is eta, and what region is Omega (with dOmega being its boundary)?
And how can I “assume some function F(x)” when F(t,x) is the function in the differential equation that I’m solving for in the first place?
Also, how can I use an ODE solver to solve a differential equation that has integrals in it??