How to know the size of the mutated array in f!(F, x)?

Is there a way to know the size of F in

function f!(F, x)
    F[1] = (x[1] + 3) * (x[2]^3 - 7) + 18
    F[2] = sin(x[2] * exp(x[1]) - 1)
    F[3] = x[1] + x[2]
    return F
end

without knowing F in advance?

My goal is to program my own nonlinear-equation solver, so I have to write it in a way that works for any size of F. For example, when customnlsolver(f!, x0) is executed, I need to know what size F has in order to set up the Jacobian for AD with ForwardDiff.

Does size(F) not work?

For that I’d have to know F in advance. Starting from something like customnlsolver(f!, x0), I want to know size(F).

Where does F come from in customnlsolver(f!, x0)?

I’m trying to reproduce this behavior:

using NLsolve

nlsolve(f!, [1.0, 1.0, 1.0])

I looked at the code of nlsolve but I don’t understand how they do it.

I don’t think it “knows” the dimensions. If you take the example from the README, with a function of two inputs, you get:

julia> nlsolve(f!, j!, [ 0.1; 1.2]) # This is the readme example
Results of Nonlinear Solver Algorithm
(...)
 * Zero: [-3.487552479724522e-16, 1.0000000000000002]
(...)

julia> nlsolve(f!, j!, [0.1 1.2 1.5]) # Passing three initial guesses
Results of Nonlinear Solver Algorithm
(...)
 * Zero: [-3.487552479724522e-16, 1.0000000000000002]
(...)

julia> nlsolve(f!, j!, [0.1]) # Passing a length-one initial guess
ERROR: BoundsError: attempt to access 1-element Vector{Float64} at index [2]

In your nonlinear-equation solver, just infer the dimension of the problem from the size of the initial guess that the user passes to your solver (not from the function object).
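
As a minimal sketch of that idea (this is not NLsolve’s implementation; customnlsolver and the plain Newton step below are just illustrative, and it assumes ForwardDiff is loaded):

using ForwardDiff, LinearAlgebra

function customnlsolver(f!, x0; maxiter = 100, tol = 1e-10)
    x = float.(x0)                          # work on a copy of the initial guess
    F = similar(x)                          # residual buffer sized from x0, not from f!
    for _ in 1:maxiter
        J = ForwardDiff.jacobian(f!, F, x)  # also stores the value of f!(F, x) in F
        norm(F) < tol && break
        x .-= J \ F                         # plain Newton step
    end
    return x
end

customnlsolver(f!, [1.0, 1.0, 1.0])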

3 Likes

But for systems in which the number of variables is different from the number of equations, how do I do that?

If the number of variables doesn’t match the number of equations then you aren’t doing root-finding, you are doing optimization (e.g. least-squares), and you will need an objective function that returns a scalar. In that case, you will be computing the gradient, not the Jacobian, and you don’t need to know the size of the internal state of your scalar objective function.
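
As a rough sketch of that idea, using the three-equation f! from the top of the thread with two unknowns (the wrapper g below is just an illustration, not any particular package’s API):

using ForwardDiff

function g(x)
    F = similar(x, 3)      # 3 equations, length(x) unknowns; F stays internal to g
    f!(F, x)
    return sum(abs2, F)    # scalar objective ‖F(x)‖²
end

ForwardDiff.gradient(g, [0.1, 1.2])   # gradient of the scalar objective, sized like x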

4 Likes

Initially I got confused by your wording, but now I see what you mean: one way to proceed is to minimize the norm of the residuals of the system. Thank you.

In a system F: \mathbb{R}^n \rightarrow \mathbb{R}^n, under what conditions is it better to solve F(x) = \vec{0} with a root-finding algorithm, versus solving the fixed-point equation G(x) = x, with G(x) = F(x) + x?

Your question makes no sense, because solving a fixed-point equation is a root-finding problem.

Do you mean the specific algorithm of fixed-point iterations?

Yes, the fixed point of G corresponds to the zeros of F, no?

Yes, that is why I said they are equivalent problems.

Rephrasing the question (at the risk of putting my foot in my mouth again): which is the “better” approach (say, in terms of accuracy or speed of convergence), directly solving for the zero of F with a nonlinear solver, or iterating on G until a fixed point is found?

“what is a ‘better’ approach” depends on the characteristics of G. Fixed point iteration can only work if the system is stable, and convergence is only as fast as the stable system’s dynamics (e.g. eigenvalues of linearized G). That’s not an issue for nonlinear solvers, but most will work best if F has a smooth gradient or other known properties. I believe it would take extreme luck for iterations of G to outperform a good nonlinear solver of F, but only you could know.
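
One way to check that condition numerically (a sketch only; G and xstar are hypothetical names for whatever map and approximate fixed point you actually have):

using ForwardDiff, LinearAlgebra

# G and xstar are assumed to be defined elsewhere (illustrative names)
JG = ForwardDiff.jacobian(G, xstar)   # linearize G at the (approximate) fixed point
maximum(abs, eigvals(JG))             # spectral radius: < 1 for local convergence,
                                      # and the closer to 1, the slower the iteration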

3 Likes

“Iterating on G” with fixed-point iterations is a nonlinear solver. So your question still doesn’t make much literal sense. But I think you mean, “How does fixed-point iteration compare to other nonlinear-solver algorithms?”

Fixed-point iteration is one of the least reliable nonlinear-solver algorithms — it only works for contractive maps, and even then is relatively slowly converging compared to e.g. Newton’s method. The main advantages of a fixed-point iteration are that it is easy to implement (which is nontrivial — computer time is cheaper than programmer time), requires minimal storage (no extraneous vectors or matrices), and doesn’t require you to supply a Jacobian matrix.
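
A minimal sketch of that plain iteration, assuming G is a non-mutating map such as G(x) = F(x) + x (illustrative only):

using LinearAlgebra

function fixedpoint(G, x0; maxiter = 1000, tol = 1e-10)
    x = float.(x0)
    for _ in 1:maxiter
        xnew = G(x)
        norm(xnew - x) < tol && return xnew   # converged
        x = xnew                              # only converges (locally) if G is contractive
    end
    return x
end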

I would use Newton’s method as my first choice if you have easy/cheap access to the Jacobian (e.g. via AD). Even without a Jacobian, there are a variety of Jacobian-free algorithms that are probably faster than fixed-point iteration, e.g. Anderson acceleration, the closely related Broyden’s method, or Newton–Krylov methods (which only require a directional derivative that can be computed by finite differences if needed). These algorithms are more complicated to implement, but that’s what we have libraries for.
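
For example, NLsolve already exposes several of these (method names as documented in its README; double-check the docs for the exact keywords):

using NLsolve

nlsolve(f!, [1.0, 1.0, 1.0])                      # default: trust-region Newton
nlsolve(f!, [1.0, 1.0, 1.0], method = :newton)    # line-search Newton
nlsolve(f!, [1.0, 1.0, 1.0], method = :anderson)  # Anderson-accelerated fixed-point iteration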

1 Like