# Solving for the steady-state of a large nonlinear ODE in SciML

I have a few questions:

1. What is the difference in scope between NonlinearSolve.jl and NLSolvers.jl/NLsolve.jl?
2. What are the pros/cons of NonlinearSolve.jl vs NLSolvers.jl/NLsolve.jl?
3. Are there any plans to include the algorithms of SIAMFANLEquations.jl into either/both of NonlinearSolve.jl and NLSolvers.jl/NLsolve.jl?
4. In particular, any plans to include the pseudo-transient continuation solver? Or maybe there are some other versions out there?
5. Are there other packages/solvers I should know about for solving large nonlinear systems?

For context, I deal with large-ish (up to ~1M) systems of nonlinear ODEs of marine tracers, for which I develop AIBECS.jl to, well, simulate these tracers. AIBECS essentially provides the tooling for users to create the functions that define the system of ODEs. In SciML’s lingo, it creates two functions of the state variable u and the parameters p:

```julia
F(u,p)   = f(u, p, t)   # the rate of change of the system
∇ₓF(u,p) = jac(u, p, t) # the Jacobian of f with respect to u
```


As you may notice, there is no t dependency in my case of f and jac, and so most of the time, I am looking for the steady state of these ODEs. To solve for that steady state, I currently use my own rewrite of @ctkelley’s MATLAB nsold.m (which is probably not optimal, in the sense that my rewrite is not optimal, not his algorithms, to be precise!). In other words, for a given set of parameters p, I feed x -> f(x,p,0) and x -> jac(x,p,0) to the Newton solver to find the root x such that f(x,p,0) ≈ 0. Now this all works quite well already in my opinion, but I’m always looking to improve the package, and at the time of writing, it seems I have many choices to replace my non-expert code, among which

• NLsolve.jl by @pkofod, the most used package for solving large nonlinear systems in Julia AFAIK
• NLSolvers.jl, @pkofod’s own rewrite of NLsolve.jl, destined to replace it, but already available for those who want to live on the edge
• JuliaComputing/SciML’s NonlinearSolve.jl, which seems to overlap heavily with NLsolve.jl/NLSolvers.jl, although it seems to target SciML problems specifically
• SIAMFANLEquations.jl, which is @ctkelley’s own Julia version of nsold.m and co, being developed in parallel to writing the corresponding book. Particularly of interest to me is the pseudo-transient continuation solver, which I don’t think has any equivalent in other Julia packages, and is very relevant to my research, where the initial guess may be quite far and impede the Newton algorithm convergence.
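To make the workflow concrete, here is a bare-bones sketch of what I mean by feeding f and jac to a Newton solver (the f, jac, and parameter values below are toy stand-ins for illustration, not actual AIBECS code):

```julia
using LinearAlgebra

# Toy autonomous system: f and jac take a t argument but have no actual t dependency.
f(x, p, t) = p .* x .- x .^ 3
jac(x, p, t) = Diagonal(p .- 3 .* x .^ 2)

# Bare-bones Newton iteration for the steady state, i.e. x such that f(x, p, 0) ≈ 0.
function newton_steadystate(f, jac, x0, p; tol = 1e-10, maxiter = 50)
    x = copy(x0)
    for _ in 1:maxiter
        r = f(x, p, 0)
        norm(r) < tol && return x
        x -= jac(x, p, 0) \ r  # Newton step: solve J s = r, then x ← x - s
    end
    error("Newton did not converge in $maxiter iterations")
end

p = [1.0, 1.0]
xstar = newton_steadystate(f, jac, [0.9, 1.2], p)  # converges to the root [1.0, 1.0]
```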

Please forgive me if I butchered your work with my ignorant definitions and assumptions; I have the utmost respect for the developers/scientists involved and I think these are all amazing pieces of work/software. But back to my questions: I’m not really sure what I should be using in the future, and I keep asking myself these or similar questions, in my head or on Slack, so I thought I should lay them down here for the experts (@pkofod, @ctkelley, @ChrisRackauckas and all the relevant SciML people, and whoever else I should know about) to chime in if they want!

3 Likes

You might be interested in BifurcationKit.jl, which is geared towards large-scale problems. You have a Newton solver which works with sparse matrices or Krylov methods for any precision (Float32…) and has been tested on GPU. Additionally, you have continuation routines which allow you to use homotopy methods.

5 Likes

SciML made a common interface for LinearProblem, NonlinearProblem, QuadratureProblem, and OptimizationProblem. The reason is that this is required in order to be a build target for ModelingToolkit, and also to allow for swapping the libraries internally (as you’ve seen with SteadyStateDiffEq.jl!). Quadrature.jl is a universal wrapper over the other quadrature libraries which also adds AD support, GalacticOptim.jl does the same for OptimizationProblem, and NonlinearSolve.jl is meant to do the same for NonlinearProblem.

The integration with ModelingToolkit makes a lot of sense because that means you can use symbolic tooling to enhance the numerical speeds. https://github.com/JuliaComputing/StructuralTransformations.jl already has nonlinear tearing for MTK-specific NonlinearSystems, so if you go through MTK you can end up with a much simpler nonlinear solve. You also get automated parallelism, sparsity detection, analytical Jacobians, etc. There’s more that can be done there too, with BLT transformations and automation of program transformation to enforce some nonlinear constraints.

NonlinearSolve.jl started by implementing a bunch of scalar nonlinear solvers. The reason is that Roots.jl wasn’t worth wrapping due to its allocations, even in the scalar case. But next up we’re going to have this library wrap (via @requires) NLsolve, NLSolvers, SUNDIALS’ KINSOL, SIAMFANLEquations.jl, and everything else we can find. Then we add AD support. Then we integrate it into tooling like SteadyStateDiffEq.jl.

The reason is that you can then just pass NLsolveAlg() into a bigger algorithm and it’ll know how to dispatch the internal nonlinear solver. This will make algorithms built on nonlinear solvers (the problem we run into) a lot more flexible without extra work from the user. So, the same kind of deal as we’ve done with differential equations (and optimization and quadrature).

But we’ll leave the implementation of nonlinear solvers to the ecosystem. That’s @pkofod and @ctkelley’s job, so we’ll make use of those. I don’t think we’ll get into the business of making special nonlinear solvers… I think.

3 Likes

I will be registering SIAMFANLEquations.jl soon. The only thing left to do is get the first part of Chapter 2 into the notebook. That part will have an example of how to use ptcsol.jl (the pseudo-transient continuation code) for the buckling beam problem.

ptcsol.jl and nsol.jl (nsold.m with some new things like mixed-precision solves) are ready to go. @briochemc, I’d be happy to assist you in your application of ptcsol.jl. If you are already using it, please tell me how it’s going.
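For readers unfamiliar with pseudo-transient continuation, the core idea can be sketched as follows (a toy illustration of the method only, not the ptcsol.jl implementation or API; f and jacf here are made up):

```julia
using LinearAlgebra

# Steady state of dx/dt = f(x) via pseudo-transient continuation (PTC):
# each step solves (I/δ - J) s = f(x) and sets x ← x + s, so small δ follows
# the flow (robust far from the root) and large δ recovers a Newton step.
f(x) = 1 .- x .^ 2           # stable steady state at x = 1, unstable at x = -1
jacf(x) = Diagonal(-2 .* x)

function ptc(f, jacf, x0; δ = 0.1, tol = 1e-8, maxiter = 500)
    x, r = copy(x0), f(x0)
    for _ in 1:maxiter
        x += (I / δ - jacf(x)) \ r
        rnew = f(x)
        norm(rnew) < tol && return x
        δ *= norm(r) / norm(rnew)  # SER update: grow δ as the residual shrinks
        r = rnew
    end
    error("PTC did not converge in $maxiter iterations")
end

# From x0 = [-0.5, 3.0], plain Newton can be pulled toward the unstable root
# at -1; PTC instead follows the flow to the stable steady state [1, 1].
xstar = ptc(f, jacf, [-0.5, 3.0])
```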

1 Like

@ctkelley Sorry for my ad; I just wanted to note the reference to my package alongside yours.

@ChrisRackauckas One additional point is whether we want to support types that are not <: AbstractArray

Yes, if the solvers do. And they already do: Number is not an AbstractArray.

@rveltz, no need to apologize. The more data we as a collective can give @briochemc, the better off he will be.

1 Like

Yup, scalar equation solvers would get very upset if fed an array. @rveltz, do you have something in mind?

Things like ArrayFire, ApproxFun…

Thank you! After looking a bit more carefully at the code, I realize that there is no option for an out-of-place Jacobian function, so I need to spend a bit more time with the inner workings of AIBECS to return an in-place Jacobian function before I can try ptcsol (forming the Jacobian out of place was never the bottleneck for me, so I had just skipped implementing that until now).

Actually, I realize now that updating a sparse Jacobian in place is not so straightforward… See, e.g., these many discussions around this topic:

@ctkelley, are there any plans to allow for out-of-place (sparse) Jacobian functions? I ask because I already have a fast-but-out-of-place Jacobian function available.

Hi,

I think you forgot these alternatives:

• use of BandedMatrices.jl (can really be much faster than sparse and relatively easy to update)
• use of SparseDiffTools.jl, which can work in place
1 Like

What do you mean? You can’t cook an in-place one from your out-of-place one?

First create a sparse matrix with “filled locations” wherever the Jacobian can be non-zero. Then, during your in-place update, fill those locations. If one of them equals zero, that’s OK: just set it to 0 (and make sure that the method you use for setting it does not delete the “filled-location” spot).
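A minimal sketch of that recipe (made-up pattern and values; the key point is writing through `nonzeros`, so stored slots survive even when their value is zero):

```julia
using SparseArrays

# Allocate once, with a stored slot everywhere the Jacobian can ever be nonzero.
J = sparse([1, 1, 2, 3], [1, 2, 2, 3], ones(4), 3, 3)

# In-place refresh: write through rowvals/nonzeros (fast, no searching or
# insertion); a value of 0.0 stays a stored ("filled") slot in the pattern.
function refresh!(J, newval)
    rows, vals = rowvals(J), nonzeros(J)
    for j in 1:size(J, 2), k in nzrange(J, j)
        vals[k] = newval(rows[k], j)
    end
    return J
end

refresh!(J, (i, j) -> i == j ? 2.0 : 0.0)  # the (1,2) slot becomes an explicit zero
nnz(J)  # still 4: the pattern is preserved even though one stored value is 0.0
```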

But yes, consider SparseDiffTools.jl

1 Like

Once you’ve allocated the storage, ptcsol defaults to the usual sparse factorization from SuiteSparse. I’ll have an example of this in a later chapter of the notebook.

For now I’m using special structures. There’s an example for PTC and the buckling beam in

where the Jacobian is tridiagonal. There is also a Newton’s method example using BandedMatrices.jl. You need to know the sparsity pattern in advance so the solver does not allocate for the Jacobian. Let me think a bit about how hard it would be for me to use an allocating Jacobian evaluation function.

@rveltz was correct on both of his points. If your Jacobian is banded, using something like BandedMatrices.jl will make your life better, and sparse differencing can figure out the sparsity pattern automatically.

I will tag the latest version of the notebook today and will register the package by the end of this week.

1 Like

Well, I think I could; it’s just a bit more complicated than I anticipated, because my out-of-place Jacobian function drops zeros (the non-structural entries that are sometimes zero). This is both good (because there are quite a few of them) and bad (because some branching inside the nonlinear function means some non-structural zeros appear and disappear, which makes it hard to use for an in-place version).

For the record, and because it helps me think about it, I’ll describe my Jacobian function here; I might eventually copy this into the package docs anyway. Also, maybe someone has good ideas to improve it and make it in-place, too, so here goes. It goes into the details, so it is a little wordy; apologies in advance.

TLDR: I have a function F that looks like F(x) = G(x) - T x, where G is a perfect candidate for ForwardDiff and SparseDiffTools, while T is already available as the Jacobian of x \mapsto T x. How can I make an in-place Jacobian for J(x) = \nabla G(x) - T?

Long version:

When trying to solve for F(x) = 0, the variable x is actually a concatenation of smaller (but still large) vectors:

x = \begin{bmatrix} x_1\\ x_2\\ \vdots \\ x_N \\ \end{bmatrix}

where each x_i can be a ~200,000 sized column vector.

All the x_i have the same length and share the same indices, in the sense that these indices correspond to some linear indices of a 3D grid. Each different x_i thus actually represents a different 3D field that has been “vectorized”. The function F can be similarly split

F(x) = \begin{bmatrix} F_1(x_1, \ldots , x_N) \\ F_2(x_1, \ldots , x_N) \\ \vdots \\ F_N(x_1, \ldots , x_N) \\ \end{bmatrix}.

and each F_i is further split into a “nonlocal” linear part and a “local” nonlinear part, like so

F_i(x) = G_i(x_1, \ldots , x_N) - T_i x_i

where the T_i are sparse matrices but not diagonal, and their sparsity structure does not correspond to any of the sparse array packages I have seen (they have 1 to ~20 off-diagonals that can be quite far off the main diagonal and whose distance to the diagonal can vary with each row/column). The G_i, on the other hand, are nonlinear, but local, in that the j-th entry [G_i(x_1, \ldots , x_N)]_j only depends on the j-th entries of its x_i arguments. The AIBECS.jl user supplies the T_i and the G_i functions, which I then use to build F and its Jacobian as a function of a single larger vector x.

The Jacobian of F can be built efficiently using automatic differentiation of the G_i only (ForwardDiff.jl in my case). If I define \partial_j G_i, the “derivative” of G_i with respect to x_j, as the (square) Jacobian of the function x_j \mapsto G_i(x_1,\ldots,x_N), which is diagonal because the G_i are “local” as explained just above, then the Jacobian of F can be written as

J(x) = \begin{bmatrix} (\partial_1 G_1)(x)-T_1 & (\partial_2 G_1)(x) & \cdots & (\partial_N G_1)(x)\\ (\partial_1 G_2)(x) & (\partial_2 G_2)(x) -T_2 & \cdots & (\partial_N G_2)(x)\\ \vdots & & \ddots & \vdots \\ (\partial_1 G_N)(x) & (\partial_2 G_N)(x) & \cdots & (\partial_N G_N)(x) - T_N\\ \end{bmatrix}

In practice, I actually build it with

J(x) = \begin{bmatrix} (\partial_1 G_1)(x) & (\partial_2 G_1)(x) & \cdots & (\partial_N G_1)(x)\\ (\partial_1 G_2)(x) & (\partial_2 G_2)(x) & \cdots & (\partial_N G_2)(x)\\ \vdots & & \ddots & \vdots \\ (\partial_1 G_N)(x) & (\partial_2 G_N)(x) & \cdots & (\partial_N G_N)(x)\\ \end{bmatrix} - \begin{bmatrix} T_1 & & & \\ & T_2 & & \\ & & \ddots & \\ & & & T_N\\ \end{bmatrix}

where N ForwardDiff’d calls to the G_i are sufficient to build the first term (which is essentially an array of diagonals) with some combination of sparse, Diagonal, hcat, and vcat operations, and where the second term uses the blockdiag function on the T_i. I believe the - operation drops all the non-structural zeros (but I need to double-check), which is good and bad, as mentioned in the intro to this post.
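As a toy illustration of this assembly (N = 2, with made-up local kernels and transport operators, and analytic diagonals standing in for the ForwardDiff’d calls):

```julia
using SparseArrays, LinearAlgebra

n = 4  # each block is n×n (in AIBECS n would be ~200,000)
x1, x2 = [1.0, 2.0, 3.0, 4.0], fill(0.5, 4)

# Made-up local nonlinearities G₁ = x₁.^2 .+ x₂ and G₂ = sin.(x₁) .+ x₂.^3,
# so each block ∂ⱼGᵢ is diagonal, with the elementwise derivatives on the
# diagonal (written analytically here; ForwardiDiff would supply them in general).
d11 = 2 .* x1          # diagonal of ∂₁G₁
d12 = ones(n)          # diagonal of ∂₂G₁
d21 = cos.(x1)         # diagonal of ∂₁G₂
d22 = 3 .* x2 .^ 2     # diagonal of ∂₂G₂

# Made-up sparse "transport" operators with off-diagonals away from the main one.
T1 = spdiagm(0 => ones(n), 2 => fill(0.5, n - 2))
T2 = spdiagm(0 => ones(n), -2 => fill(0.5, n - 2))

# J(x) = [array of diagonal blocks] - blockdiag(T₁, T₂)
J = [spdiagm(0 => d11) spdiagm(0 => d12);
     spdiagm(0 => d21) spdiagm(0 => d22)] - blockdiag(T1, T2)
```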

I could use SparseDiffTools for the first term by supplying the sparsity pattern, but it is not efficient to use it on the whole F, because the T_i fill in a lot of the sparsity pattern and I don’t actually need ForwardDiff for the T_i x_i terms, since these are linear and the T_i are already provided.

So how should I go about making an in-place version of this Jacobian?

Using FD, you can update the non-diagonal blocks in place. So I would focus on updating block (1,1) in place: save the indices of T_1 that are changed by \partial_1G_1 and write to them in place. This is what I do here
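In sketch form (a toy T_1 and made-up names, not the code linked above): precompute, once, where each diagonal slot of the (1,1) block lives among the stored values, then overwrite only those slots at each Jacobian refresh:

```julia
using SparseArrays

n = 5
T1 = spdiagm(0 => ones(n), 2 => fill(0.5, n - 2))  # toy transport block
J11 = -T1                   # block (1,1) = ∂₁G₁ - T₁, with ∂₁G₁ diagonal

# One-off: locate the diagonal slots of J11 inside its stored values.
diagslots = zeros(Int, n)
for j in 1:n, k in nzrange(J11, j)
    rowvals(J11)[k] == j && (diagslots[j] = k)
end

# Each refresh: overwrite only the diagonal slots with (∂₁G₁)ⱼⱼ - (T₁)ⱼⱼ.
function refresh11!(J11, T1, dG1, diagslots)
    for j in eachindex(dG1)
        nonzeros(J11)[diagslots[j]] = dG1[j] - T1[j, j]
    end
    return J11
end

refresh11!(J11, T1, fill(3.0, n), diagslots)  # off-diagonal slots of -T₁ untouched
```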

1 Like

@rveltz I just started thinking about all this again and noticed that your link no longer works. I’m assuming the code was just moved somewhere else; could you update said link?
1 Like