Reformulation-Perspectification Technique: a code example using JuMP

In this tutorial we aim to solve the following nonconvex program (NC) via a convex relaxation (CR).


using JuMP # assuming JuMP ≥ 1.15, whose nonlinear interface accepts the expressions below directly

NC = Model()

@variable(NC, x[1:2])

@constraint(NC, Lin1, x[1] + x[2] <= 1)

E = 1 + exp(-1)

@constraint(NC, Conv1, exp(-x[1]) + exp(-x[2]) <= E)

function nl(x) return -log(x) end # this function is convex

function concave_term(x) return (x + 2) * nl(x + 2) end # this term is concave, ∵ x -> x * log(x) is convex

function primal_obj_f(x) return 2 * x[1] + 3 * x[2] - 5 * x[1] * x[2] + concave_term(x[1]) end

@objective(NC, Min, primal_obj_f(x))

Since the objective contains concave_term (as well as the bilinear term -5 * x[1] * x[2]) and the sense is MIN_SENSE, (NC) is nonconvex.

Nonetheless, in a lifted space, this undesirable term can be made convex.

Before discussing that, note from Conv1 that exp(-x[1]) <= E, hence x[1] >= -log(E) ≈ -0.3133 and thus x[1] + 2 >= 1.6867 > 0, which guarantees that the logarithm in concave_term(x[1]) is well defined.
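A quick numerical check of this implication (plain Julia):

E = 1 + exp(-1)
x1_min = -log(E) # Conv1 forces exp(-x[1]) <= E, i.e. x[1] >= -log(E) ≈ -0.3133
x1_min + 2 # ≈ 1.6867 > 0, so the logarithm inside concave_term(x[1]) is well defined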

We proceed:


# if x + 2 > 0, we have the following reformulation for `concave_term`, provided that X == x * x

function convex_term(x, X) return (x + 2) * nl((X + 4 * x + 4) / (x + 2)) end # ∵ this is the perspective of the convex function nl evaluated at affine arguments, hence jointly convex in (x, X) whenever x + 2 > 0
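As a quick sanity check (evaluated at an arbitrary point in a plain Julia session), substituting X = x^2 indeed recovers concave_term:

function nl(x) return -log(x) end
function concave_term(x) return (x + 2) * nl(x + 2) end
function convex_term(x, X) return (x + 2) * nl((X + 4 * x + 4) / (x + 2)) end
convex_term(0.7, 0.7^2) ≈ concave_term(0.7) # true, since (X + 4 * x + 4) / (x + 2) == x + 2 when X == x^2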

This motivates us to introduce the auxiliary decision variable X to stand for the outer product x * transpose(x).


@variable(CR, X[1:2, 1:2], Symmetric)

We rewrite constraint Lin1 as


@expression(CR, L1, 1 - x[1] - x[2])

@constraint(CR, Lin1, L1 >= 0)

We can multiply both sides of Conv1 by L1 >= 0, write each term L1 * exp(-x[i]) in perspective form L1 * exp((-L1 * x[i]) / L1), and linearize the quadratic products (e.g. -L1 * x[1] = -x[1] + x[1]^2 + x[1] * x[2] becomes -x[1] + X[1, 1] + X[1, 2]) via X, which yields the valid inequality


@constraint(CR, Valid1, L1 * exp((X[1, 1] + X[1, 2] - x[1]) / L1) + L1 * exp((X[1, 2] + X[2, 2] - x[2]) / L1) <= E * L1)
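As a sanity check (a numerical sketch at an arbitrarily chosen point, in a plain Julia session), when X equals the outer product x * transpose(x) the left-hand side of Valid1 coincides with L1 times the left-hand side of Conv1:

x = [0.2, 0.3]; X = x * transpose(x)
L1 = 1 - x[1] - x[2]
lhs_lifted = L1 * exp((X[1, 1] + X[1, 2] - x[1]) / L1) + L1 * exp((X[1, 2] + X[2, 2] - x[2]) / L1)
lhs_scaled = L1 * (exp(-x[1]) + exp(-x[2]))
lhs_lifted ≈ lhs_scaled # true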

As explained in the MOSEK Modeling Cookbook, in JuMP the feasibility system

a[2] * exp(a[1] / a[2]) <= a[3]; a[2] > 0, a[3] > 0

or, equivalently (recalling nl(x) = -log(x)),

a[2] * nl(a[3] / a[2]) <= -a[1]; a[2] > 0, a[3] > 0

can be coded as

[a[1], a[2], a[3]] in JuMP.MOI.ExponentialCone().
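As a minimal, self-contained illustration of this recipe (a toy epigraph problem; the model and variable names here are ours, and any exponential-cone-capable solver works in place of Mosek):

import JuMP
import MosekTools
toy = JuMP.Model(MosekTools.Optimizer)
JuMP.@variable(toy, t)
# encode 1 * exp(1.5 / 1) <= t, i.e. t >= exp(1.5), via a = (1.5, 1, t)
JuMP.@constraint(toy, [1.5, 1, t] in JuMP.MOI.ExponentialCone())
JuMP.@objective(toy, Min, t)
JuMP.optimize!(toy)
JuMP.value(t) # ≈ exp(1.5) ≈ 4.4817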

We now present the code for the convex relaxation (CR).

Note that the following code is readily runnable.


# start a pristine julia REPL

import JuMP

import MosekTools # as the conic solver

E = 1 + exp(-1);

CR = JuMP.Model(() -> MosekTools.Optimizer())

JuMP.@variable(CR, x[1:2]); # the primal decisions

# the following 2 lines are routinely written

JuMP.@variable(CR, X[eachindex(x), eachindex(x)], Symmetric);

JuMP.@constraint(CR, [1 transpose(x); x X] >= 0, JuMP.PSDCone()); # SDP cut

# We add the constraint `Lin1`

JuMP.@expression(CR, L1, 1 - x[1] - x[2])

JuMP.@constraint(CR, Lin1, L1 >= 0)

# To proceed with the `exp` function, we need some auxiliary epigraphical variables, named t

JuMP.@variable(CR, t[1:5]);

# We add the constraint `Conv1` through the following 3 lines with the help of `t[4:5]`

JuMP.@constraint(CR, t[4] + t[5] <= E)

JuMP.@constraint(CR, [-x[1], 1, t[4]] in JuMP.MOI.ExponentialCone())

JuMP.@constraint(CR, [-x[2], 1, t[5]] in JuMP.MOI.ExponentialCone())

# We add the constraint `Valid1` through the following 3 lines with the help of `t[2:3]`

JuMP.@constraint(CR, [X[1, 1] + X[1, 2] - x[1], L1, t[2]] in JuMP.MOI.ExponentialCone())

JuMP.@constraint(CR, [X[1, 2] + X[2, 2] - x[2], L1, t[3]] in JuMP.MOI.ExponentialCone())

JuMP.@constraint(CR, t[2] + t[3] <= E * L1)

# We build the `concave_term` in the objective function through the following 3 lines with the help of `t[1]`

JuMP.@expression(CR, L2, x[1] + 2);

nothing; # a `Lin2` constraint enforcing `L2 >= 0` could be added here, like `Lin1`, but we showed above (via Conv1) that it is already implied

JuMP.@constraint(CR, [-t[1], L2, X[1, 1] + 4 * x[1] + 4] in JuMP.MOI.ExponentialCone())

# We finish the objective

JuMP.@objective(CR, Min, 2 * x[1] + 3 * x[2] - 5 * X[1, 2] + t[1])

JuMP.optimize!(CR)

@assert JuMP.termination_status(CR) == JuMP.OPTIMAL

lb = JuMP.objective_bound(CR)

function nl(x) return -log(x) end

function concave_term(x) return (x + 2) * nl(x + 2) end

function primal_obj_f(x) return 2 * x[1] + 3 * x[2] - 5 * x[1] * x[2] + concave_term(x[1]) end

feasible_x_returned_by_CR = JuMP.value.(x) # [0.8030754682915741, 0.1969245334462786]

ub = primal_obj_f(feasible_x_returned_by_CR)

# We observe that lb = -1.4829799922508433 < -1.4829798593248442 = ub

# Given this accuracy, we conclude that the primal nonconvex problem (NC) can be solved to global optimality via (CR)

This example is taken from "An extension of the Reformulation-Linearization Technique to nonlinear optimization" (Optimization Online).

Although in this example we reach global optimality thanks to the SDP cut

JuMP.@constraint(CR, [1 transpose(x); x X] in JuMP.PSDCone());

in practice, when length(x) is large, SDP may scale poorly.
We then have 2 options to further relax (CR) (an implementation sketch follows the list):

  1. Drop the SDP cut.
  2. Drop the SDP cut, but additionally add the trivial bounds X[i, i] >= 0 as follows:
JuMP.set_lower_bound.([X[i, i] for i in eachindex(x)], 0);
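These experiments can be run without rebuilding the model from scratch: the SDP cut can be deleted in place, provided it was registered under a name. A minimal sketch, assuming the PSD constraint above had been declared with a name, e.g. JuMP.@constraint(CR, SDPcut, [1 transpose(x); x X] >= 0, JuMP.PSDCone()):

JuMP.delete(CR, SDPcut) # option 1: drop the SDP cut
JuMP.set_lower_bound.([X[i, i] for i in eachindex(x)], 0); # option 2 additionally adds X[i, i] >= 0
JuMP.optimize!(CR)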

We define the following numerical vector to facilitate reference.
v = [-35.17, -2.19, -1.48, -1.29, -1.04]
It’s clear that v[3] is the global optimal value.

The results of the experiment suggest:

  • With the 1st option, v[3] is bracketed by the interval [v[1], v[4]] (lower bound v[1], upper bound v[4]).
  • With the 2nd option, v[3] is bracketed by the interval [v[2], v[5]].

As expected, the 2nd option yields a stronger lower bound than the 1st. Yet we find that, on the primal side, the upper bound from the 2nd option is inferior. This suggests that a better dual bound can be obtained simply by adding more valid cuts, whereas good primal solutions should be collected along the way.

A simple way to (potentially) improve the feasible solution returned by (CR) is to hand it to Ipopt as a starting point for solving (NC), thereby obtaining a locally optimal solution:

JuMP.set_start_value.(x, feasible_x_returned_by_CR) # here `x` denotes (NC)'s variables

As such, we might obtain a better upper bound (it works in this case).
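A minimal sketch of this warm-started local solve, assuming Ipopt.jl is installed and a recent JuMP version (≥ 1.15) whose nonlinear interface accepts these expressions directly (feasible_x_returned_by_CR comes from the script above):

import JuMP
import Ipopt
E = 1 + exp(-1)
NC = JuMP.Model(Ipopt.Optimizer)
JuMP.@variable(NC, x[1:2])
JuMP.@constraint(NC, Lin1, x[1] + x[2] <= 1)
JuMP.@constraint(NC, Conv1, exp(-x[1]) + exp(-x[2]) <= E)
JuMP.@objective(NC, Min, 2 * x[1] + 3 * x[2] - 5 * x[1] * x[2] - (x[1] + 2) * log(x[1] + 2))
JuMP.set_start_value.(x, feasible_x_returned_by_CR) # warm start from (CR)'s feasible point
JuMP.optimize!(NC)
ub_refined = JuMP.objective_value(NC) # potentially a tighter upper bound than `ub`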