Solving for the steady-state of a large nonlinear ODE in SciML

ctkelley · February 8, 2021, 12:17pm

I’ll have a matrix-free version of pseudo-transient continuation ready soon (mid-March at the latest). You’ll need a good preconditioner for it to work well.

Support for sparse matrix factorization PTC will be in the next release (withing days) and is already in the master branch. If you can store a sparse Jacobian, my stuff should do the job for you.

The current release should support general sparse Jacobians but the upcoming release has a working example and does the job for certain.

rveltz · March 23, 2021, 8:13am

Thinking about this again, is the jacobian a Fredholm operator? I am thinking that T is a differential operator and G is a compact operator. In this case, you can use a GMRES solver with a Matrix-free jacobian using T as a preconditioner. You can save the (incomplete) LU decomposition of T before hand.

I use the same strategy here and here (scroll to the end) or here!

ctkelley · March 23, 2021, 9:23am

The Jacobian (Frechet derivitive) of a nonlinear differential operator, for example, is also a differential operator, which is not Fredholm. So if

F(u) = u'' + f(u) + boundary conditions, then the Frechet derivative acts on a function via
F'(u)v = v'' + f'(u) v. This is an unbounded operator and is not Fredholm

For integral opreators like F(u)(x) = u(x) = \int k(x,y,u) u(y) dy, you get
F'(u)v(x) = v(x) + \int \frac{\partial k}{\partial u} (x,y,u) v(y) dy, which is a compact perturbation of the identity and Fredholm.

antoine-levitt · March 23, 2021, 9:34am

I haven’t followed the discussion but I think Romain means that (in this case) you precondition with the Laplacian, so (1-Delta)^-1 (u’’ + f’(u)) is Fredholm.

rveltz · March 23, 2021, 9:37am

Yes I was a bit quick, I mean you can precondition with the differential operator whose resolvent is often compact.

Thanks @antoine-levitt, we were typing at the same time it seems

ctkelley · March 23, 2021, 9:45am

OK. That’s right.

briochemc · March 23, 2021, 11:23am

Thank you for the suggestion! OK so I’m not very fresh on topological spaces (compactness and all that), so I’ll try to avoid discussing these abstract concepts, but don’t hesitate to tell me if I should think more about it and to correct me if I’m wrong. Again, I’ll lay out a detailed take on what I’m doing or trying to do, so apologies for the length.

I’ll start from the equation for the Jacobian that I already had above:

J(x) = \begin{bmatrix} (\partial_1 G_1)(x) & (\partial_2 G_1)(x) & \cdots & (\partial_N G_1)(x)\\ (\partial_1 G_2)(x) & (\partial_2 G_2)(x) & \cdots & (\partial_N G_2)(x)\\ \vdots & & \ddots & \vdots \\ (\partial_1 G_N)(x) & (\partial_2 G_N)(x) & \cdots & (\partial_N G_N)(x)\\ \end{bmatrix} - \begin{bmatrix} T_1 & & & \\ & T_2 & & \\ & & \ddots & \\ & & & T_N\\ \end{bmatrix}

where the G_i(x_1,\ldots,x_n) are functions of the vectors x_i and the \partial_iG_j are the corresponding (sub-block) Jacobians, which are all diagonals. The T_i are matrices that, as you say, represent differential operators (but discretized onto a finite 3D grid and vectorized (as in, rearranged from 3D to 1D)).

A sketch of what it might look like in ASCII art :

         ⠑⢄    ⠑⢄        ⠳⣄
J(x) =   ⠑⢄ ⠑⢄ ⠑⢄   +       ⢿⣷
            ⠑⢄ ⠑⢄              ⠑⢄

     =     ∇G(x)    +       T

I’m assuming there was a bit of confusion with the names, but I guess by G you meant the first term in that equation above (that I write as ∇G(x)). Assuming that’s correct, then I think T might fit your criteria, although my T contains at least one T_i block on the diagonal that is divergence-free, such that T_i * ones(n) .== 0 (where n = size(T_i, 1)). This means the null space of T_i — and thus that of T — is not trivial, which I think might be a problem for doing LU (or iLU too?) on T. (It is thanks to the diagonal terms that arise from the \partial_iG_i that the null space of the whole Jacobian J(x) is actually trivial and allows me to “invert” it.)

However, I think I can get away with a preconditioner by rearranging things a bit. Indeed, I can remove a constant piece from the \partial_i G_j and move them into T such that it is invertible and amenable to iLU preconditioning. That’s because the blocks for which the T_i are non-invertible also come with a corresponding G_i that has a linear term. So I can write them as G_i(x_1,\ldots,x_N) = \tilde{G}_i(x_1, \ldots, x_N) + \lambda_i x_i, where the \lambda_i x_i term has Jacobian \lambda_i I. Sticking this \lambda_i I into a new \tilde{T}_i = T_i - \lambda_i I allows me to rewrite J(x) as

J(x) = \begin{bmatrix} (\partial_1 \tilde{G}_1)(x) & (\partial_2 G_1)(x) & \cdots & (\partial_N G_1)(x)\\ (\partial_1 G_2)(x) & (\partial_2 \tilde{G}_2)(x) & \cdots & (\partial_N G_2)(x)\\ \vdots & & \ddots & \vdots \\ (\partial_1 G_N)(x) & (\partial_2 G_N)(x) & \cdots & (\partial_N \tilde{G}_N)(x)\\ \end{bmatrix} - \begin{bmatrix} \tilde{T}_1 & & & \\ & \tilde{T}_2 & & \\ & & \ddots & \\ & & & \tilde{T}_N\\ \end{bmatrix}

where the \tilde{T}_i are all invertible so iLU should work (I think). But I’m not sure:

Are the criteria that you mention met? (Do they strictly need to be?)
Is \nabla\tilde{G}(x) compact?
Is the resolvent of T compact? (I don’t know what the resolvent is actually so I’d love a small explanation if possible, otherwise, I have been trying to go to bed for 1hr+ now so I’ll leave it at that and come back to this later )

Another question comes to mind though. When LU + backsolve work (i.e., when the matrices are small or sparse enough and I have enough memory), then shouldn’t I stick with that instead of going for Krylov? I thought GMRES would be slower in this case. (FWIW, GMRES and co. are part of composability that I am looking for in the future, but I don’t think I need it right now.)

ctkelley · March 23, 2021, 11:34am

If you can precondition with a fast solver for the high-order term, Krylov linear solvers will almost always do better than a direct method for this kind of problem in terms of both time and storage.

rveltz · March 23, 2021, 11:39am

Are the criteria that you mention met? (Do they strictly need to be?)

Without knowing the underlying equation, it is tough. But you have everything (numerically) at hand to try it seems. It should not take you long to try given what you coded already

I don’t know what the resolvent is actually so I’d love a small explanation if possible

the resolvent of T is (\lambda - T)^{-1} for \lambda\in\mathbb C not in the spectrum of T.

then shouldn’t I stick with that instead of going for Krylov?

Exactly, that’s the point. You basically dont need the full jacobian, just the Matrix-vector product.

antoine-levitt · March 23, 2021, 11:43am

Here is Fredholm theory in a nutshell, because it confused me for a while before I understood what it was (of course the following is a sketch, but it gives the intuition for the definitions). An operator (smooth, on a bounded set) is compact if it has a negative order of derivatives. What this means is that its eigenvalues will accumulate to zero as the discretization is refined: high frequencies get sent to zero by the negative order. A Fredholm operator is (most often) something like 1 + K, where K is compact. Such operators will have spectrum accumulating at 1, which Krylov solvers love. So if you have a differential operator like u → -Delta u + v(x) u or something, it’s not compact because the high frequencies will go to infinity, and Krylov solvers are not going to like that. OTOH L = (1-Delta)^-1 (Delta u + v(x) u) is 1+K with K negative order of derivatives (and therefore compact), so L is Fredholm. That’s just a fancy way of saying that you have effectively preconditioned the highest-order term (the laplacian) so that the remaining bits will not blow up as you refine the mesh.

briochemc · April 19, 2021, 5:49am

It took a while but I think I now have working code to build in-place Jacobians for AIBECS. (FWIW, I opted for some hacky grouping and stacking of operators from DiffEqOperators.jl.) Anyway, right now, I would like to test different solvers. Following

I’m thinking that a package to wrap them all would be the ideal framework for me. So my question is: When do you think NonlinearSolve.jl will provide/wrap all these solvers?

And again, thank you all for all your contributions in packages, tips, explanations, etc.

ChrisRackauckas · April 19, 2021, 6:50am

NonlinearSolve.jl has its own methods, Sundials.jl’s KINSOL is on this interface. NLsolve.jl is handled in https://github.com/SciML/SciMLNLSolve.jl , with MINPACK dispatches missing but fairly close. Finishing the MINPACK version should only take a few minutes for whoever has the time. https://github.com/ctkelley/SIAMFANLEquations.jl could happen soon? All of these should automatically have adjoints BTW through the SciML interface. Then we need to document it like DifferentialEquations.jl as common interface + solver choices.

ctkelley · April 19, 2021, 10:01am

SIAMFANLEquations.jl is on schedule and the Newton-Krylov stuff is mostly done. I will tag a new release after the print book → notebook map is done, probably later this week.

The schedule is codes done by end of July and all goes to the publisher by 12/31/21.

Topic		Replies	Views
Nonlinear systems: NLsolve alternative (non-convergence issue) Optimization (Mathematical) question , nlsolve , nonlinear	3	1041	January 25, 2021
What nonlinear solvers are available? Specific Domains nlsolve	11	3037	January 11, 2022
Solving nonlinear equations using inexact Newton — seeking recommendations General Usage question , nonlinear	5	178	September 26, 2024
Newton + LineSearch in Hot Loop: Reduce allocations Numerics newton-raphson , nonlinearsolve	6	123	November 14, 2024
Nlsolver vs Matlab fsolve Optimization (Mathematical) question	6	253	September 17, 2024

Solving for the steady-state of a large nonlinear ODE in SciML

Related topics