`Initial guess is not an interior point` in optimization methods

I’m using Optim.jl via Optimization.jl and hitting `Initial guess is not an interior point`. I get that I could/should change my initial condition, but I’d like a general solution that works whenever a feasible point exists (I’m exposing the functionality to not-so-technical users). It made me wonder why Optim.jl / Optimization.jl doesn’t simply start by searching for an interior point instead of complaining. In a complex non-convex problem this might be difficult, but a lot of optimization problems aren’t.

Is there any function I can call to easily find an interior point (i.e. any point that strictly satisfies the constraints)?

I’m not completely sure, but I think they use an infeasible primal-dual method: they transform general inequality constraints into equality constraints by adding slack variables s, and move the bounds on the primal variables x and the slacks s into barrier terms.

Therefore an interior initial guess only needs to satisfy strict bound constraints on the x (and s) variables (and strict bound constraints on the bound dual variables, if those are provided). So that’s not too difficult to obtain.
Edit: solvers like IPOPT automatically push the initial guess into the interior of the bound constraints.
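The bound-push idea is easy to sketch. Here is a rough Python/NumPy illustration of pushing a guess strictly inside finite box bounds; this is in the spirit of Ipopt’s `bound_push` option but is NOT Ipopt’s exact formula, and the function name and `kappa` margin are made up:

```python
import numpy as np

def push_to_interior(x0, lb, ub, kappa=1e-2):
    """Clip an initial guess strictly inside finite box bounds [lb, ub].

    Roughly in the spirit of Ipopt's bound_push option (NOT Ipopt's exact
    rule; kappa and the name are made up): move each component at least a
    small margin away from its bounds, but never past the midpoint of a
    narrow interval.
    """
    x0, lb, ub = (np.asarray(a, dtype=float) for a in (x0, lb, ub))
    margin = np.minimum(kappa, 0.5 * (ub - lb))
    return np.clip(x0, lb + margin, ub - margin)

x0 = push_to_interior([0.0, 5.0, 2.0], lb=[0.0, 0.0, 0.0], ub=[1.0, 10.0, 2.0])
# every component now satisfies lb[i] < x0[i] < ub[i] strictly
```

Note this only handles bound constraints; it does nothing for general nonlinear inequalities, which is exactly where the thread’s harder question starts.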

Methods like interior-point Newton do require an interior point - even without a fancy primal-dual scheme - since they “keep away from the boundary” in their steps. In other words: outside the feasible region, their barrier term would evaluate to infinity.
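For reference, the standard log-barrier term in question is

B_\mu(x) = -\mu \sum_k \log\bigl(-c_k(x)\bigr),

which is only defined when every c_k(x) < 0 holds strictly, and blows up to +\infty as any constraint approaches its boundary - hence the need for a strictly interior starting point.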

There is not much you can do about this, besides switching to methods that do not require it.
In principle you could start with something like an augmented Lagrangian method and wait for the iterates to land inside - which might never happen if your solution lies on the boundary.

(A guess, but probably the truth:) because this procedure alone admits of no efficient solution method in general, so they simply don’t implement it. As you mentioned, the context is non-convex.

Gurobi can solve nonlinear nonconvex optimization problems (which of course includes helping you find a feasible point), but the algorithm it uses is not guaranteed to be efficient either.

You could just minimize the constraint violation: e.g. if you have a constraint c(x) \le 0, you could minimize c(x), stopping as soon as you reach a point where c(x) < 0. If you have multiple constraints c_k(x) \le 0, you could minimize via the epigraph formulation (which is a way to transform the “minimax” optimization of the biggest constraint violation into a differentiable NLP):

\begin{array}{c} \displaystyle{\min_{t\in \mathbb{R},\, x} t} \\ \text{s.t. } t \ge c_k(x) \;\; \forall k \end{array} \Longleftrightarrow \min_x \left[ \max_k c_k(x) \right]

(This can easily be initialized to a feasible starting point t_0 = \max_k c_k(x_0).)

(The CCSA/MMA algorithms in NLopt.jl do something of this flavor IIRC.)
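For concreteness, here is a minimal sketch of that epigraph feasibility search in Python with SciPy. The constraints `c1`, `c2` and the function name are made up for illustration, and SLSQP stands in for whatever NLP solver you prefer:

```python
import numpy as np
from scipy.optimize import minimize

# Two made-up inequality constraints c_k(x) <= 0: stay inside the unit
# disk, and keep x[0] >= 0.5.
def c1(x): return x[0] ** 2 + x[1] ** 2 - 1.0
def c2(x): return 0.5 - x[0]

constraints = [c1, c2]

def find_feasible_point(x0):
    """Epigraph form: min t over (t, x) s.t. t >= c_k(x) for every k.

    The epigraph NLP is always feasible at t0 = max_k c_k(x0), and its
    optimum equals min_x max_k c_k(x); ending with t <= 0 means x is
    feasible for the original constraints.
    """
    z0 = np.concatenate(([max(c(x0) for c in constraints)], x0))
    cons = [{"type": "ineq", "fun": (lambda z, c=c: z[0] - c(z[1:]))}
            for c in constraints]  # SciPy's "ineq" means fun(z) >= 0
    res = minimize(lambda z: z[0], z0, method="SLSQP", constraints=cons)
    return res.x[1:], res.x[0]

x, t = find_feasible_point(np.array([2.0, 2.0]))  # infeasible start
```

In practice you would stop early (e.g. via a callback) as soon as t drops below 0, since any such iterate is already feasible, rather than solving the epigraph problem to optimality.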

Of course, this is not guaranteed to find a feasible point, but that’s true in general of non-convex optimization: finding a point in the feasible set might be arbitrarily hard. This is the fundamental reason why algorithms like to assume that you start with a feasible point — they can only guarantee convergence to a local optimum if they start at a feasible point.

5 Likes

Nonconvex optimization models come in two flavors: nonlinear programming (NLP) and mixed-integer linear programming (MILP). I learned from Bertsekas’s book his remark that (quote) “NLP is easier than MILP formulation”.

But I think there is one commendable advantage pertaining to MILP—you can get a valid dual bound (e.g. Gurobi can provide a valid dual bound whereas Ipopt cannot).

So in this situation:

If you are using a local optimizer, say Ipopt, to do this minimization, you specify an initial starting point and wait for some time for a local solution. If the resulting objective is > 0, then this trial is inconclusive and disposable—you have to re-select another starting point and retry (which could turn out to be inconclusive in the same way).

However, if you are using Gurobi to do this minimization, you don’t need to spend time selecting starting points. And more importantly, you get valid information on both the primal and the dual side. So the stopping criterion is an OR condition—stop when either

  • a solution with primal objective < 0 is found

or

  • the dual bound becomes positive.

The former guarantees that a feasible point exists, whereas the latter gives a deterministic certificate of infeasibility.


But to sum up, the most meaningful takeaway from this discussion is that:

  • Optimization problems and feasibility problems are, to some extent, the same thing.

You can solve an optimization problem to arbitrary precision with a feasibility solver. (The reverse direction is trivial, so I omit it.)
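To make that first claim concrete: if you have an oracle that answers the pure feasibility question “does a feasible x with f(x) \le t exist?”, then bisection on the level t recovers the optimal value to arbitrary precision. A toy sketch (the oracle and function name here are hypothetical):

```python
def minimize_via_feasibility(is_feasible, lo, hi, tol=1e-8):
    """Bisect on the objective level t.

    Each step asks only a feasibility question -- "does a feasible x with
    f(x) <= t exist?" -- and halves the bracket [lo, hi] around the
    optimal value. `is_feasible` is a (hypothetical) feasibility oracle.
    """
    while hi - lo > tol:
        t = 0.5 * (lo + hi)
        if is_feasible(t):
            hi = t  # the level set is nonempty: the optimum is <= t
        else:
            lo = t  # the level set is empty: the optimum is > t
    return hi

# Toy oracle: minimize f(x) = x^2 over x in [1, 3]. The level set
# {x in [1, 3] : x^2 <= t} is nonempty iff t >= 1, so the optimum is 1.
fstar = minimize_via_feasibility(lambda t: t >= 1.0, lo=-10.0, hi=10.0)
# fstar ≈ 1.0
```

Each feasibility query may itself be expensive (or undecidable in the nonconvex case), which is exactly why the equivalence is conceptual rather than a free lunch.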

Yes, global optimizers provide more information than local optimizers, albeit at a big price in complexity. But this is true of NLPs too.

If you’re willing to solve NP-hard problems, you could also throw a global optimizer at a continuous NLP to either find the feasible set (and, for that matter, the approximate global optimum) or bound it as being smaller than any given diameter. And global optimizers could also be used to obtain approximate dual bounds.

Thank you for the answers everyone. I appreciate that this is not possible in general, of course, but for convex problems one is guaranteed to find a feasible point if it exists, I believe?

Sounds like it should work.

Feels like it’s something that Optimization.jl could at least attempt, instead of throwing an error. Or it could implement a try_satisfaction=true kwarg.

Use a different solver? Why do you need Optim? Ipopt will work.

2 Likes

Convex problems admit efficient algorithms that converge to the global solution. In practice, convex problems are formulated using convex cones, and you can then solve them efficiently via e.g. Mosek. So it’s true that the solver can either provide a feasible solution or prove infeasibility within a time period that a human can tolerate, given your problem scale.

The expectation is reasonable, so you can just open an issue on their repository and see what you get. But personally I think this feasibility-preprocessing procedure is not easy to implement generically. In my MILP decomposition field, for example, there are packages such as SDDP.jl and Coluna.jl that provide generic services, but I still tend to implement the algorithm myself so I can better exploit the specific problem structure. And this is always true in mathematical programming—there are generally no generic foolproof algorithms. You always need to “tune” your algorithm somehow to achieve maximum performance.

Your description is a bit vague.

I don’t think Ipopt can give an INFEASIBLE certificate on nonconvex problems. (I think it can only report locally infeasible or a local solution.)

For convex problems (say, LP or QP), I’m not sure whether Ipopt can give an INFEASIBLE certificate. Do you have any idea about this? I don’t use Ipopt that much compared to Gurobi, but I think Gurobi deals with large-scale problems better. I’ve just tried to use Ipopt to solve a large-scale LP, which gives me

This is Ipopt version 3.14.19, running with linear solver MUMPS 5.8.1.

Number of nonzeros in equality constraint Jacobian...:  8389969
Number of nonzeros in inequality constraint Jacobian.: 13847808
Number of nonzeros in Lagrangian Hessian.............:        0

MUMPS returned INFO(1) =-13 - out of memory when trying to allocate 94107 MB.
In some cases it helps to decrease the value of the option "mumps_mem_percent".
Total number of variables............................:  7145090
                     variables with only lower bounds:   582912
                variables with lower and upper bounds:  5299201
                     variables with only upper bounds:        1
Total number of equality constraints.................:  1554432
Total number of inequality constraints...............:  3596858
        inequality constraints with only lower bounds:   487994
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:  3108864

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  0.0000000e+00 1.10e+02 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
MUMPS returned INFO(1) =-13 - out of memory when trying to allocate 94107 MB.
In some cases it helps to decrease the value of the option "mumps_mem_percent".
WARNING: Problem in step computation; switching to emergency mode.
   1r 0.0000000e+00 1.10e+02 9.99e+02   2.0 0.00e+00    -  0.00e+00 0.00e+00R  1
MUMPS returned INFO(1) =-13 - out of memory when trying to allocate 94107 MB.
In some cases it helps to decrease the value of the option "mumps_mem_percent".
WARNING: Problem in step computation; switching to emergency mode.
Cannot call restoration phase at point that is almost feasible for the restoration NLP (violation 0.000000e+00).
Abort in line search due to no other fall back.
Step computation in the restoration phase failed.

Number of Iterations....: 1

                                   (scaled)                 (unscaled)
Objective...............:   0.0000000000000000e+00    0.0000000000000000e+00
Dual infeasibility......:   1.0000000000000000e+00    1.0000000000000000e+00
Constraint violation....:   1.1016500053135002e+02    1.1016500053135002e+02
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   3.1777970010000002e+05    3.1777970010000002e+05
Overall NLP error.......:   3.1777970010000002e+05    3.1777970010000002e+05


Number of objective function evaluations             = 2
Number of objective gradient evaluations             = 2
Number of equality constraint evaluations            = 2
Number of inequality constraint evaluations          = 2
Number of equality constraint Jacobian evaluations   = 1
Number of inequality constraint Jacobian evaluations = 1
Number of Lagrangian Hessian evaluations             = 2
Total seconds in IPOPT                               = 114.731

EXIT: Restoration Failed!

julia> JuMP.solution_summary(cen)
solution_summary(; result = 1, verbose = false)
├ solver_name          : Ipopt
├ Termination
│ ├ termination_status : OTHER_ERROR
│ ├ result_count       : 1
│ └ raw_status         : Restoration_Failed
├ Solution (result = 1)
│ ├ primal_status        : INFEASIBLE_POINT
│ ├ dual_status          : UNKNOWN_RESULT_STATUS
│ ├ objective_value      : 0.00000e+00
│ └ dual_objective_value : 0.00000e+00
└ Work counters
  ├ solve_time (sec)   : 1.17831e+02
  └ barrier_iterations : 1

Seems that Gurobi performs better on this large-scale problem.

Maybe others have comments from the theoretical standpoint, but I’m speaking here based on practical experience.

I’ve just spent 10 minutes playing with Ipopt (letting it solve a medium-scale LP feasibility problem). I’m a bit curious about the algorithm it uses: why doesn’t it just terminate?

julia> cen
A JuMP Model
├ solver: Ipopt
├ objective_sense: FEASIBILITY_SENSE
├ num_variables: 492145
├ num_constraints: 1112091
│ ├ JuMP.AffExpr in MOI.EqualTo{Float64}: 97152
│ ├ JuMP.AffExpr in MOI.GreaterThan{Float64}: 30650
│ ├ JuMP.AffExpr in MOI.LessThan{Float64}: 194304
│ ├ JuMP.VariableRef in MOI.GreaterThan{Float64}: 413208
│ └ JuMP.VariableRef in MOI.LessThan{Float64}: 376777
└ Names registered in the model
  └ :pA_scalar

julia> JuMP.optimize!(cen)
This is Ipopt version 3.14.19, running with linear solver MUMPS 5.8.1.

Number of nonzeros in equality constraint Jacobian...:   526327
Number of nonzeros in inequality constraint Jacobian.:   878664
Number of nonzeros in Lagrangian Hessian.............:        0

Total number of variables............................:   449639
                     variables with only lower bounds:    36432
                variables with lower and upper bounds:   334270
                     variables with only upper bounds:        1
Total number of equality constraints.................:    97152
Total number of inequality constraints...............:   224954
        inequality constraints with only lower bounds:    30650
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:   194304

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  0.0000000e+00 1.10e+02 3.34e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  0.0000000e+00 1.10e+02 3.33e+00  -1.0 7.38e+02    -  4.23e-04 5.48e-04f  1
   2  0.0000000e+00 1.10e+02 3.33e+00  -1.0 1.23e+03    -  4.76e-04 8.02e-04f  1
   3  0.0000000e+00 1.10e+02 3.33e+00  -1.0 1.95e+03    -  1.21e-03 6.82e-04f  1
   4  0.0000000e+00 1.10e+02 3.33e+00  -1.0 3.00e+03    -  1.50e-03 1.28e-03f  1
   5  0.0000000e+00 1.10e+02 3.53e+00  -1.0 4.46e+03    -  8.93e-04 1.93e-03f  1
   6  0.0000000e+00 1.09e+02 8.70e+00  -1.0 5.62e+03    -  1.52e-03 3.57e-03f  1
   7  0.0000000e+00 1.09e+02 1.18e+01  -1.0 6.98e+03    -  2.56e-03 5.91e-03f  1
   8  0.0000000e+00 1.07e+02 1.52e+01  -1.0 8.24e+03    -  3.25e-03 1.00e-02f  1
   9  0.0000000e+00 1.06e+02 1.81e+01  -1.0 9.03e+03    -  5.21e-03 1.73e-02f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  0.0000000e+00 1.05e+02 1.83e+01  -1.0 9.52e+03    -  2.22e-03 5.06e-03f  1
  11  0.0000000e+00 1.04e+02 1.86e+01  -1.0 9.62e+03    -  4.23e-03 1.20e-02f  1
..........
 499r 0.0000000e+00 4.17e+00 7.39e+02  -0.1 1.83e+02    -  4.27e-03 4.63e-03f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
 500r 0.0000000e+00 4.17e+00 7.35e+02  -0.1 1.93e+02    -  6.08e-03 5.08e-03f  1
 501r 0.0000000e+00 4.17e+00 7.31e+02  -0.1 2.20e+02    -  5.64e-03 4.15e-03f  1
 502r 0.0000000e+00 4.17e+00 7.27e+02  -0.1 3.08e+02    -  4.23e-03 7.87e-03f  1
 503r 0.0000000e+00 4.17e+00 7.23e+02  -0.1 3.67e+02    -  5.11e-03 5.17e-03f  1
 504r 0.0000000e+00 4.17e+00 7.21e+02  -0.1 3.63e+02    -  3.32e-03 3.35e-03f  1
 505r 0.0000000e+00 4.17e+00 7.17e+02  -0.1 3.06e+02    -  3.02e-03 6.61e-03f  1
 506r 0.0000000e+00 4.17e+00 7.13e+02  -0.1 2.37e+02    -  1.09e-02 3.93e-03f  1
 507r 0.0000000e+00 4.17e+00 7.09e+02  -0.1 4.22e+02    -  5.28e-03 5.52e-03f  1
 508r 0.0000000e+00 4.17e+00 7.04e+02  -0.1 4.39e+02    -  7.07e-03 6.57e-03f  1
 509r 0.0000000e+00 4.17e+00 7.00e+02  -0.1 2.29e+02    -  2.10e-03 5.57e-03f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
 510r 0.0000000e+00 4.17e+00 6.97e+02  -0.1 2.04e+02    -  4.55e-03 5.35e-03f  1
^C 511r 0.0000000e+00 4.17e+00 6.94e+02  -0.1 1.87e+02    -  8.92e-03 2.90e-03f  1

Number of Iterations....: 511

                                   (scaled)                 (unscaled)
Objective...............:   0.0000000000000000e+00    0.0000000000000000e+00
Dual infeasibility......:   6.9422267608956781e+02    6.9422267608956781e+02
Constraint violation....:   4.1686091819226494e+00    4.1686091819226494e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   1.1883859412602860e+00    1.1883859412602860e+00
Overall NLP error.......:   3.4312950505403933e+02    6.9422267608956781e+02


Number of objective function evaluations             = 518
Number of objective gradient evaluations             = 198
Number of equality constraint evaluations            = 518
Number of inequality constraint evaluations          = 518
Number of equality constraint Jacobian evaluations   = 1
Number of inequality constraint Jacobian evaluations = 1
Number of Lagrangian Hessian evaluations             = 2
Total seconds in IPOPT                               = 664.573

EXIT: Stopping optimization at current point as requested by user.

julia> JuMP.solution_summary(cen)
solution_summary(; result = 1, verbose = false)
├ solver_name          : Ipopt
├ Termination
│ ├ termination_status : INTERRUPTED
│ ├ result_count       : 1
│ └ raw_status         : User_Requested_Stop
├ Solution (result = 1)
│ ├ primal_status        : INFEASIBLE_POINT
│ ├ dual_status          : UNKNOWN_RESULT_STATUS
│ ├ objective_value      : 0.00000e+00
│ └ dual_objective_value : -6.03182e+07
└ Work counters
  ├ solve_time (sec)   : 6.64649e+02
  └ barrier_iterations : 510

Notice that there is no warning in this log.

So my resulting speculation here is: I think the philosophy of Ipopt is

  • We just try our established algorithm on your problem, but we do not guarantee any fruitful outcome within a reasonable time period.

Edit: It appears that the only advantage of using Ipopt here is that you don’t need to input a starting point manually. But it cannot give you a definite infeasibility certificate. So if it succeeds, you get what you want; otherwise it doesn’t yield any usable info.