Ipopt's "local" optimum can be very substandard

Continuing the discussion from The solution deteriorates if the domain of decisions is improperly set when using Ipopt:

The title is illustrated with three examples, where

  1. the result of the first one is completely unacceptable;
  2. the result of the second one is heavily substandard;
  3. the result of the third one is fine, but it suggests that Ipopt may NOT converge to the local optimum nearest to the warm-start point. (Doesn't that diminish the importance of the warm-start point?)

import JuMP, Ipopt
# 1ļøāƒ£
NLP = JuMP.Model(Ipopt.Optimizer) # continuous nonlinear program
JuMP.@variable(NLP, -Float64(2 * pi) <= x <= Float64(2 * pi), start = 0.0)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.termination_status(NLP), JuMP.primal_status(NLP) # (MathOptInterface.LOCALLY_SOLVED, MathOptInterface.FEASIBLE_POINT)
x = JuMP.value(x) # 0.0
cos(x) # 1.0
# 2ļøāƒ£
NLP = JuMP.Model(Ipopt.Optimizer)
JuMP.@variable(NLP, -10.0 <= x <= 14.6, start = 0.0)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.termination_status(NLP), JuMP.primal_status(NLP) # (MathOptInterface.LOCALLY_SOLVED, MathOptInterface.FEASIBLE_POINT)
x = JuMP.value(x) # 14.600000143196766
cos(x) # -0.44648501954346353
# 3ļøāƒ£
NLP = JuMP.Model(Ipopt.Optimizer)
JuMP.@variable(NLP, 0.0 <= x <= 15.0, start = 0.0)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.termination_status(NLP), JuMP.primal_status(NLP) # (MathOptInterface.LOCALLY_SOLVED, MathOptInterface.FEASIBLE_POINT)
x = JuMP.value(x) # 9.424777960585782
cos(x) # -1.0

These are all expected behavior. Ipopt is a local optimizer: it does not find global minima, only stationary points. Ipopt is guaranteed to find the global optimum only if the problem is convex and twice continuously differentiable.

In example one, your starting point x = 0 is a stationary point of cos(x).
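This is easy to verify by hand (the derivatives below are hand-written for checking only; Ipopt gets this information from JuMP's automatic differentiation): the gradient of cos vanishes at x = 0, so the first-order optimality conditions already hold there, and the negative second derivative identifies it as a maximum.

```julia
# Hand-written derivatives of the objective f(x) = cos(x):
f(x)   = cos(x)
df(x)  = -sin(x)    # first derivative
d2f(x) = -cos(x)    # second derivative

df(0.0)   # zero: the start point already satisfies first-order optimality
d2f(0.0)  # -1.0: negative curvature, so this stationary point is a maximum
```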

For the second example, I assume that, as a heuristic, Ipopt additionally evaluates the objective at the variable bounds; it found a better solution there, so it returned that. Evaluating the variable bounds in the first example did not improve the objective.

For the third example, the starting point is at the edge of the feasible region. Ipopt perturbs it into the interior, which lets it escape the local maximum.


I see, but cos is a nice function: it belongs to C^∞ (it has continuous derivatives of every order). In the first example, Ipopt gives me a local maximum, which cannot survive any slight perturbation. In that case, what is the meaning of specifying Min in JuMP.@objective(NLP, Min, cos(x))?

I don't think Ipopt can memorize the best solution found so far.
Let's discuss it separately here.

In the third example, what I meant was:
Ipopt returns 3 * pi, which is correct. But clearly pi is the optimum nearest to the start value 0.0.

One more question: is the usage in my first post correct (i.e., as intended)?
Do I have to explicitly pass gradient and Hessian info to Ipopt, or something else?

Let's see two more examples where the start value is not a stationary point.

import JuMP, Ipopt
# 4ļøāƒ£
NLP = JuMP.Model(Ipopt.Optimizer)
JuMP.@variable(NLP, 0.0 <= x <= 15.0, start = 0.15)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.termination_status(NLP), JuMP.primal_status(NLP) # (MathOptInterface.LOCALLY_SOLVED, MathOptInterface.FEASIBLE_POINT)
x = JuMP.value(x) # 9.424777960585782
cos(x) # -1.0
# 5ļøāƒ£
NLP = JuMP.Model(Ipopt.Optimizer)
JuMP.@variable(NLP, 0.0 <= x <= 15.0, start = 0.18)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.termination_status(NLP), JuMP.primal_status(NLP) # (MathOptInterface.LOCALLY_SOLVED, MathOptInterface.FEASIBLE_POINT)
x = JuMP.value(x) # 15.000000146136275
cos(x) # -0.7596880078894556

In the 4th example, the expected local minimum is pi ≈ 3.14, because it is the closer minimum, yet Ipopt returns 3 * pi.
In the 5th example the same applies, but the returned solution only achieves -0.759 (substandard, and unreasonably poor in my view).

I have no other comments. This is expected behavior of Ipopt. If you violate the assumption that the problem is convex, there is no guarantee which stationary point it will find.

Do I have to explicitly pass gradient and Hessian info to Ipopt, or something else?

Nope. JuMP uses automatic differentiation to compute the gradient and Hessian oracles for you.
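As an illustration of what that AD layer provides, here is a sketch using the ForwardDiff package, one of Julia's AD tools (JuMP's default nonlinear backend is actually its own sparse reverse-mode AD inside MathOptInterface, not ForwardDiff, but the idea is the same):

```julia
import ForwardDiff

# Derivatives of the objective f(x) = cos(x), computed automatically
# rather than supplied by hand:
f(x) = cos(x)
g(x) = ForwardDiff.derivative(f, x)   # first derivative, equals -sin(x)
H(x) = ForwardDiff.derivative(g, x)   # second derivative, equals -cos(x)

g(0.0), H(0.0)  # gradient zero and curvature -1 at the start point of example 1
```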


Talking about a "closer minimum" is not relevant here. IPOPT builds a quadratic local model of the original problem at every iteration, and moving from one iterate to the next depends on the shape of this local model. The local model resembles the original problem only when we're close to the current point.


Thanks, I take your point.

So the downside is that Ipopt's algorithm, under the default settings, is not versatile.
But the upside includes:

  1. at least it will not return a solution worse than the start point (?)
  2. at least it is fast, since it does not need to guarantee global optimality (?)

I'm not very acquainted with Ipopt. I'll read its docs then.

PS do you think your software (UNO) can do better in these examples?

IPOPT is generally quite robust, but probably not as robust as an SQP method. For the first instance, the issue is that IPOPT does not follow feasible directions of negative curvature, but regularizes the problem instead. Therefore, it won't notice if you start at a local maximum because the optimality conditions are satisfied.

To answer your questions:

  1. yes, if the problem is unconstrained, IPOPT returns a point that is no worse than the starting point. If the problem is constrained, it depends on whether the starting point is feasible and on the numerical tolerance.
  2. indeed, IPOPT doesn't guarantee global optimality in the nonconvex case, only local optimality. If you start far from a local solution, progress may be slow until you reach a "basin of attraction" where Newton's method converges quickly. Most methods are equipped with globalization techniques that ensure that, whatever the initial point, you reach a stationary point.
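As a concrete check of the first point, one can rerun example 1 with the start nudged slightly off the stationary point (a sketch; exactly which local minimizer Ipopt lands on from a given start is not guaranteed):

```julia
import JuMP, Ipopt

NLP = JuMP.Model(Ipopt.Optimizer)
JuMP.set_silent(NLP)
# Same model as example 1, but started slightly away from the maximum at 0
JuMP.@variable(NLP, -2pi <= x <= 2pi, start = 0.1)
JuMP.@objective(NLP, Min, cos(x))
JuMP.optimize!(NLP)
JuMP.value(x)  # a stationary point other than the maximum at x = 0
```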

The interior-point method (IPM) implemented in Uno closely follows the IPOPT algorithm (with fewer bells and whistles), so its performance is in the same ballpark as IPOPT (a tad less robust). On your 3 instances, the Uno IPM produces exactly the same results. Feel free to try it out:

using JuMP, AmplNLWriter, Uno_jll

function Optimizer()
    options = String["print_solution=yes", "preset=ipopt"]
    return AmplNLWriter.Optimizer(Uno_jll.amplexe, options)
end

model = Model(() -> Optimizer())
@variable(model, -Float64(2 * pi) <= x <= Float64(2 * pi), start = 0.0)
@objective(model, Min, cos(x))
JuMP.optimize!(model)

That said, I have a few ideas on how to improve upon IPOPT. I'm hoping I'll get to try them out this year :slight_smile:


In this case I may go with Optim.jl instead.

To conclude, I think NLP is problem-specific: every algorithm has its own field where it applies well. "The closest" minimum indeed shouldn't be expected (consider, e.g., minimizing cos(1e6 * pi * x)). It's hard for a black-box solver to infer what step size is proper.
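To make that parenthetical example concrete (hand-computed, using the same hypothetical objective): for cos(1e6 * pi * x), consecutive local minimizers are only 2e-6 apart, far below typical solver step sizes and tolerances, so "the closest minimum" loses practical meaning.

```julia
# For f(x) = cos(1e6 * pi * x), minimizers sit at x = (2k + 1) / 1e6:
f(x) = cos(1e6 * pi * x)
spacing = 2 / 1e6   # distance between consecutive minimizers: 2.0e-6
f(1 / 1e6)          # ≈ -1.0: the minimizer nearest the origin
f(3 / 1e6)          # ≈ -1.0: its neighbor, only 2.0e-6 away
```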