I have an f: \mathbb{R}^n \to \mathbb{R}^m function with m > n. I am looking for a root f(x) \approx 0. f is continuous, differentiable, generally rather nice, if somewhat nonlinear.
I managed to make this work with NLsolve.nlsolve
, by picking some of the results, eg using something like (stylized code)
n = ...
mask = [true, false, ...]
@assert sum(mask) == n
nlsolve(x -> f(x)[mask], zeros(n); autodiff = true)
and theory tells me that
all(f(x)[mask] .== 0)
implies that
all(f(x)[.!mask] .== 0)
(specifically, things like Walras’s law, this is a general equilibrium model). However, because of the nonlinearity, it can happen that I can be within tolerance with f(x)[mask]
, and way outside tolerance for other values.
I can experiment with various mask
s and tolerances, but I am wondering if there is a systematic approach to this. Sorry, no MWE, the actual problem is about 4000 LOC, it is a small miracle I made AD work at all.
This sort of problem comes up quite a lot in symmetric dynamical systems. The nicest ways of dealing with it involve factoring out the invariance from the equations (effectively reducing m such that n=m) or adding artificial variables that break the invariance (effectively increasing n such that n=m; these variables should get set to zero in the root finding). Both of these approaches requires quite a bit of knowledge to make them work and so might not be appropriate for you.
The simplest way to deal with this (though somewhat inelegant) is to replace the linear solve within the nonlinear root finder with a least-squares solution. The last time I had to do this was in Matlab where the backslash operator is conveniently overloaded to do least squares or a linear solve (depending on the matrix size) and so it usually happens automatically. I’ve not had to do this with NLsolve but I think you can change the linear solver used in the more recent releases.
1 Like
I always appreciate being politely informed about my lack of knowledge But links to resources or examples would be welcome so that I could remedy this.
Unfortunately, this goes to wacky local minima about 20% of the time.
I am wondering whether there is a trust-region quasi-variant of Gauss-Newton [EDIT: maybe Levenberg–Marquardt?]
If you’re willing to try least squares, I’d be glad if you tried JSOSolvers.trunkls
and/or CaNNOLeS
. Both use NLPModels.AbstractNLSModel
to create the problem, e.g.
using CaNNOLeS, NLPModels
# Rosenbrock
nls = ADNLSModel(x -> [x[1] - 1; 10 * (x[2] - x[1]^2)], [-1.2; 1.0], 2)
stats = cannoles(nls)
2 Likes
Apologies! Not quite the tone I was going for - I was more meaning that the elegant methods are more work than is usually necessary/desireable.
On my phone at the moment - will try to dig out some refs when I next open my laptop. I’ve never used Levenberg-Marquardt myself but I do know people who swear by it for this sort of this.
2 Likes
Did you try lmfit
from LsqFit? It’s not too easy to find I guess, and for that reason I expect to include an LM-implementation based on this in “new Optim”. I originally moved it out of Optim actually, which was maybe a mistake in hindsight. It’s quite high on my list now that most of the straight optimization stuff is in place. Do you need published/tagged/registered library code, or are you willing to experiment a bit? :]
1 Like
I am perfectly willing to experiment, thanks. First I need to think about my objective to make it less nonlinear, then I will try all of these and report back.
I can’t find the exact paper I was thinking of but a starting point is Claudia Wulff and Frank Schilder. Numerical bifurcation of Hamiltonian relative periodic orbits.
SIAM J. Appl. Dyn. Syst., 8(3), pp. 931-966 (2009). (I’m not sure how accessible it is to people outside the field of Hamiltonian dynamics - certainly reading it now is a lot harder than I remember it being at the time…)
Another example is Peeters, M., Viguié, R., Sérandour, G., Kerschen, G., & Golinval, J. C. (2009). Nonlinear normal modes, Part II: Toward a practical computation using numerical continuation techniques. Mechanical systems and signal processing, 23(1), 195-216. They construct an overdetermined system and use the Moore-Penrose pseudo-inverse in conjunction with a Gauss-Newton method (sec 3.2.2).
Related to the Peeters paper, the simplest example of adding extra variables I can think of is the computation of periodic orbits of the differential equation
\ddot x + x + x^3 = 0
This equation has a family of periodic orbits that live on a manifold. To find them the typical approach is to add a phase condition (to fix time invariance) and the pseudo-arclength equation (to allow path following), which results in f:\mathbb{R}^n\to\mathbb{R}^{n+1}. To solve this, either follow the paper and use Moore-Penrose with Newton or, alternatively, add the extra variable a\dot x to the differential equation, where a is to be solved for. This gives a new problem g:\mathbb{R}^{n+1}\to\mathbb{R}^{n+1} which can be solved with standard techniques. The key to this is the fact that for a\neq0 the manifold does not exist and so the root finder will automatically choose a=0. The term a\dot x is known as an unfolding term; in this case it works because the manifold is a result of a conservation law in the underlying equations (in this case conservation of energy) and the unfolding term destroys the conservation law by adding dissipation. If your equations have some form of conservation law embedded within them, you might be able to add suitable unfolding terms but it isn’t always obvious how to.
1 Like
The extra variable method turns out to work great. You really fixed a major issue in our project which had us stuck for a week. Thank you so much!
If anyone is interested in details, I have N labor markets and K goods markets, and parametrize in labor prices w_n. I fix w_1, the rest of the N-1 wages are free parameters, along with a.
- wages allow me to calculate labor supply and consumption demand,
- some first order conditions allow calculation of labor demand from
output = consumption .+ a
- I solve for
labor_supply .- labor_demand
being 0 using nlsolve
(forward mode AD, trust region) and then verify isapprox(a, 0)
.
This seems to work 100% of the time in out unit tests (while the naive method I was using before failed about 70% of the time (for random parameters)).
2 Likes