How to properly specify Jacobian & Hessian functions of inequality constraints (Optim)

I’m trying to optimize an objective function with 19 variables in Optim with the following inequality constraints:

0 <= x[1]/3 - x[2] <= 1/3 
5 <= 1/x[3] + 1/x[4] <= 6

I’m trying to use either IPNewton() or NewtonTrustRegion, so I need to supply both a Jacobian and Hessian for the constraints. My question is: what is the correct way to write the Jacobian and Hessian functions?

I believe the constraint function would be

function con_c!(c,x)
   c[1] = x[1]/3 - x[2]          
   c[2] = 1/x[3] + 1/x[4]
   c
end

Would the Jacobian function be

function con_jacobian!(J,x)
   #first constraint:
   J[1,1] = 1/3
   J[1,2] = -1.0
  #second constraint:
   J[2,3] = -1/(x[3])^2
   J[2,4] = -1/(x[4])^2
   J
end

? (I assume all other indices of J are automatically set to zero?)

My main question: What would the Hessian function be? This is where I’m most confused. My understanding was that we take Hessians of scalar-valued functions. So do we have to enter multiple Hessians, one for each constraint function (2 in my case)?

I’ve looked at the multiple constraints example given here https://github.com/JuliaNLSolvers/ConstrainedOptim.jl, but I’m still confused. In the example, it looks like they are adding together two Hessian matrices…? Would greatly appreciate some help.

Just a note that I’ve posted this question on Stack Overflow because I haven’t received any replies here after 2 days.

Here are the docs: https://julianlsolvers.github.io/Optim.jl/stable/#examples/generated/ipnewton_basics/#multiple-constraints

I guess you want something like this, assuming I haven’t made a mistake.

using Optim

f(x) = sum(x)

function f_grad!(g, x)
    fill!(g, 1.0)
    return g
end

function f_hess!(h, x)
    fill!(h, 0.0)
    return h
end

function c(y, x)
    y[1] = x[1] / 3 - x[2]
    y[2] = 1 / x[3] + 1 / x[4]
    return y
end

function c_jac!(J, x)
    fill!(J, 0.0)
    # c₁'(x) = [1 / 3, -1, 0, 0]
    J[1, 1] = 1 / 3
    J[1, 2] = -1
    # c₂'(x) = [0, 0, -1 / x[3]^2, -1 / x[4]^2]
    J[2, 3] = -1 / x[3]^2
    J[2, 4] = -1 / x[4]^2
    return J
end

function c_hess!(H, x, λ)
    # c₁''(x) = 0
    # c₂''(x) = Diag([0, 0, 2 / x[3]^3, 2 / x[4]^2])
    H[3, 3] += λ[2] * 2 / x[3]^3
    H[4, 4] += λ[2] * 2 / x[4]^3
    return H
end

x0 = [0.5, 0.1, 1.2, 0.2]
df = TwiceDifferentiable(f, f_grad!, f_hess!, x0)
dfc = TwiceDifferentiableConstraints(
    c, 
    c_jac!, 
    c_hess!, 
    [0.0, 0.0, 0.0, 0.0], 
    Float64[], 
    [0.0, 5.0], 
    [1 / 3, 6.0],
)
optimize(df, dfc, x0, IPNewton())

You could also write this in JuMP (Introduction · JuMP), which may be easier:

using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
@variable(model, x[1:4] >= 0.00001)
@NLconstraint(model, 0 <= x[1] / 3 - x[2] <= 1 / 3)
@NLconstraint(model, 5 <= 1 / x[3] + 1 / x[4] <= 6)
@objective(model, Min, sum(x))
1 Like

Thanks for your reply! I used your Hessian function code and the algorithm worked! Out of curiousity, I tried replacing IPNewton() with NewtonTrustRegion(), just to see how the results would compare. But I got this weird error:

StackOverflowError:

Stacktrace:
 [1] optimize(::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::TwiceDifferentiableConstraints{var"#con_c!#89",var"#con_jacobian!#90",var"#con_hess!#91",Float64}, ::Array{Float64,1}, ::NewtonTrustRegion{Float64}, ::Optim.Options{Float64,Nothing}; inplace::Bool, autodiff::Symbol) at C:\Users\Michael\.julia\packages\Optim\CK6Dn\src\multivariate\optimize\interface.jl:148
 [2] optimize(::TwiceDifferentiable{Float64,Array{Float64,1},Array{Float64,2},Array{Float64,1}}, ::TwiceDifferentiableConstraints{var"#con_c!#89",var"#con_jacobian!#90",var"#con_hess!#91",Float64}, ::Array{Float64,1}, ::NewtonTrustRegion{Float64}, ::Optim.Options{Float64,Nothing}) at C:\Users\Michael\.julia\packages\Optim\CK6Dn\src\multivariate\optimize\interface.jl:147
 ... (the last 2 lines are repeated 28980 more times)
 [57963] ModelFit(; ODE_model::Function, ODE_vars::Array{String,1}, vars_to_fit::Array{String,1}, paramIC_dict::Dict{Symbol,Any}, data_dict::Dict{Symbol,String}, f_calc_ICs::typeof(f_ICs), N0::Int64, fit_range::Array{String,1}, date_obs_last::String, forecast_until::String, date_format::String, norm::typeof(mean_abs_rel_error), norm_weights::Array{Int64,1}, norm_scale::Int64, integrator_options::Dict{Symbol,Any}, optimizer_options::Dict{Symbol,Any}) at .\In[60]:317
 [57964] top-level scope at In[61]:15

It did work, however, when I removed the inequality constraints. Does NewtonTrustRegion() not support inequality constraints? (I checked the docs but they only gave one very short example for this method, and it was without any constraints.)

I assume it doesn’t support constraints, but it should throw a nicer error. You should open an issue: Issues · JuliaNLSolvers/Optim.jl · GitHub

1 Like

Thanks, I just opened an issue here.

I also think the documentation could clearer about the capabilities of the different algorithms, particularly regarding the types of constraints that each supports/doesn’t support. It’d be nice to have a table like this: https://www.mathworks.com/help/optim/ug/problems-handled-by-optimization-toolbox-functions.html

Open an issue if you have suggestions to improve the documentation.

Even better, make a pull request editing the documentation. Here’s the algorithm choice section:

and here’s the home page:

You can click the pencil icon to edit online. Follow the instructions at the bottom of the page to make a pull request.

2 Likes

Thanks for the suggestions. For now I’ll open an issue and include some broad suggestions for how I think the documentation could be improved. Since I’m still a Julia newbie, I don’t know enough about the Optim package to be able to actually revise the documentation…

i did a patch supporting AD on constraints, basically, the “hessian” here, is not a hessian, but an operation over a hessian, adding the lagrange multipliers. a simplified version of what the jacobian and hessian of constraints is doing can be found in this post:

1 Like

This has actually been implemented but poorly documented. I was tipped off to this by seeing the package author commenting somewhere that they fixed this, but given I can’t seem to find that post on Discourse it’s likely worth copying.

The key is that there is a parameter in the TwiceDifferentiableConstraint constructor that allows for autodiff, but unlike in TwiceDifferentiable it is not a keyword variable.

Here’s a MWE based on the IPNewton example

using Optim, NLSolversBase #hide
import NLSolversBase: clear! #hide
import ForwardDiff as Diff

fun(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

x0 = [0.25, 0.25]
lc = [-Inf, 0.0]; uc = [0.5^2, 0.0]
lx = [-0.5, -0.5]; ux = [0.5, 0.5]

function con2_c!(c, x)
    c[1] = x[1]^2 + x[2]^2     ## First constraint
    c[2] = x[2]*sin(x[1])-x[1] ## Second constraint
    return c
end

df = TwiceDifferentiable(fun, x0, autodiff = :forwarddiff)
dfc = TwiceDifferentiableConstraints(con2_c!,
                                     lx, ux, lc, uc, :forwarddiff)
res = optimize(df, dfc, x0, IPNewton())
println(res)
println(res.minimizer)
4 Likes