I am solving a nonlinear problem using JuMP and IPOPT. All of my constraints are linear or degree-2 polynomials. I first tried to add the nonlinear constraints using @NLconstraint, and here is the summary from IPOPT:
This is Ipopt version 3.12, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 554022
Number of nonzeros in inequality constraint Jacobian.: 285500
Number of nonzeros in Lagrangian Hessian.............: 1144062
Total number of variables............................: 117369
variables with only lower bounds: 0
variables with lower and upper bounds: 117369
variables with only upper bounds: 0
Total number of equality constraints.................: 109186
Total number of inequality constraints...............: 105502
inequality constraints with only lower bounds: 32284
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 73218
Then I also tried using @constraint to add all of the nonlinear constraints (which is the only change I made), and IPOPT reports:
This is Ipopt version 3.12, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 988378
Number of nonzeros in inequality constraint Jacobian.: 434498
Number of nonzeros in Lagrangian Hessian.............: 785852
Total number of variables............................: 117369
variables with only lower bounds: 0
variables with lower and upper bounds: 117369
variables with only upper bounds: 0
Total number of equality constraints.................: 109186
Total number of inequality constraints...............: 105502
inequality constraints with only lower bounds: 32284
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 73218
The sparsity patterns of the Jacobian and Hessian are very different. In terms of performance, the first version is extremely slow: the primal step size becomes very small (1e-3 or 1e-4), the Hessian of the Lagrangian becomes ill-conditioned so that a nontrivial perturbation is needed for many iterations, and IPOPT reached 3000 iterations in the end. In contrast, the second version finishes in around 300 iterations. Could anyone please explain why the two ways of adding constraints cause this difference?
Can you provide a minimal working example? (Your problem looks quite large, so can you simplify it while retaining the difference in sparsity pattern?)
using Ipopt, JuMP
model = Model(with_optimizer(Ipopt.Optimizer))
@variable(model, x[i in 1:10], lower_bound = 0)
@variable(model, y[i in 1:10], lower_bound = 0)
# degree-2 constraints added through the nonlinear interface
@NLconstraint(model, [i in 1:9], x[i]*y[i] == x[i+1]*y[i+1])
@objective(model, Min, sum(x) + sum(y))
optimize!(model)
with output
This is Ipopt version 3.12, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 36
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 54
If constraints are added by @constraint:
using Ipopt, JuMP
model = Model(with_optimizer(Ipopt.Optimizer))
@variable(model, x[i in 1:10], lower_bound = 0)
@variable(model, y[i in 1:10], lower_bound = 0)
# the same constraints added through the quadratic/affine interface
@constraint(model, [i in 1:9], x[i]*y[i] == x[i+1]*y[i+1])
@objective(model, Min, sum(x) + sum(y))
optimize!(model)
then IPOPT gives
Number of nonzeros in equality constraint Jacobian...: 36
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 18
The numbers of nonzeros in the Lagrangian Hessian are different. The difference is 36 in total, i.e. each of the 9 constraints has 4 fewer Hessian nonzeros with @constraint.
A difference in the number of nonzeros is expected because of how @NLconstraint processes expressions. They go through very different code pathways. If you know the constraint is quadratic or linear, you should usually use @constraint instead. However, I would expect the only effect of this to be a difference in derivative evaluation time. I cannot explain why the number of iterations of Ipopt would differ; that sounds like a bug somewhere.
Try running Ipopt with the derivative_test = "second-order" option to validate that the derivatives are correct.
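For example, a minimal sketch using the same with_optimizer style as your MWE, and assuming Ipopt.jl accepts solver options as keyword arguments (the checker then verifies the derivatives at the starting point before optimization begins):
using Ipopt, JuMP
# sketch: pass Ipopt's derivative_test option when constructing the optimizer
model = Model(with_optimizer(Ipopt.Optimizer, derivative_test = "second-order"))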
I see. So maybe using QuadExpr() would help JuMP better identify the quadratic structure?
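Something along these lines is what I mean (just a sketch built on the MWE above, using add_to_expression!; I have not checked whether it changes the reported sparsity):
for i in 1:9
    # build x[i]*y[i] - x[i+1]*y[i+1] as an explicit quadratic expression
    ex = QuadExpr()
    add_to_expression!(ex, 1.0, x[i], y[i])
    add_to_expression!(ex, -1.0, x[i+1], y[i+1])
    @constraint(model, ex == 0)
end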
I have never used the derivative checker from Julia. I did use it once in C++, where I had to provide the gradient and Hessian manually. I remember there was one constraint that puzzled me for a while: my provided value for some component was 1, but the derivative checker reported 0. It is possible that I was wrong, but otherwise I am not sure how accurate it is… Also, when the derivative checker is turned on, is it true that IPOPT will always use the gradient and Hessian it computes itself (by some automatic differentiation)?
I didn’t observe much difference in IPOPT iterations for examples other than the one in the original post, which is also the largest one I have tested. I can play around with it more and try to reproduce the difference on smaller problems. Thanks for the advice.