In this tutorial we aim to solve the following nonconvex program (NC) via a convex relaxation (CR).
using JuMP
NC = Model() # model object for (NC); stated here for reference, not solved directly
@variable(NC, x[1:2])
@constraint(NC, Lin1, x[1] + x[2] <= 1)
E = 1 + exp(-1)
@constraint(NC, Conv1, exp(-x[1]) + exp(-x[2]) <= E)
function nl(x) return -log(x) end # nl is convex on x > 0
function concave_term(x) return (x + 2) * nl(x + 2) end # concave, because u -> u * log(u) is convex and concave_term(x) == -(x + 2) * log(x + 2)
function primal_obj_f(x) return 2 * x[1] + 3 * x[2] - 5 * x[1] * x[2] + concave_term(x[1]) end
@objective(NC, Min, primal_obj_f(x))
Since the objective contains concave_term (as well as the indefinite bilinear term -5 * x[1] * x[2]) and the sense is MIN_SENSE, (NC) is nonconvex; a quick numerical check follows.
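As a sanity check (a hypothetical probe, not part of the model), primal_obj_f violates midpoint convexity along the segment from [0, 0] to [1, 1]:
a = [0.0, 0.0]; b = [1.0, 1.0]; m = (a + b) / 2
primal_obj_f(m) > (primal_obj_f(a) + primal_obj_f(b)) / 2 # true, certifying nonconvexity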
Nonetheless, in a lifted space, this undesirable term can be made convex.
Before this is discussed, we conclude from Conv1 that exp(-x[1]) <= E, i.e. x[1] >= -log(E) ≈ -0.3133, which implies x[1] + 2 >= 1.6867 > 0, whence the validity of the logarithm in concave_term(x[1]).
We proceed:
# if x + 2 > 0 and X == x^2, then (X + 4 * x + 4) / (x + 2) == x + 2, so the following reproduces `concave_term`
function convex_term(x, X) return (x + 2) * nl((X + 4 * x + 4) / (x + 2)) end # jointly convex in (x, X): it is the perspective of the convex function `nl`
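A quick numerical check of this identity (hypothetical values, for illustration only):
xt = 0.7 # any value with xt + 2 > 0 works
convex_term(xt, xt^2) ≈ concave_term(xt) # true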
This motivates us to introduce the auxiliary matrix decision X to stand for the outer product x * transpose(x), so that X[1, 1] plays the role of x[1]^2 above. By the Schur complement, the relaxed requirement X - x * transpose(x) ⪰ 0 is equivalent to [1 transpose(x); x X] ⪰ 0, which is the SDP cut added in the code below.
@variable(CR, X[1:2, 1:2], Symmetric)
We rewrite constraint Lin1 as
@expression(CR, L1, 1 - x[1] - x[2])
@constraint(CR, Lin1, L1 >= 0)
Since L1 >= 0, we can multiply both sides of Conv1 by L1; writing each product L1 * exp(-x[i]) in perspective form L1 * exp((L1 * (-x[i])) / L1) and lifting the quadratic terms x[i] * x[j] into X[i, j] yields the valid inequality
@constraint(CR, Valid1, L1 * exp((X[1, 1] + X[1, 2] - x[1]) / L1) + L1 * exp((X[1, 2] + X[2, 2] - x[2]) / L1) <= E * L1)
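To see where the exponents come from, here is a spot check (hypothetical values) at a point with X == x * transpose(x):
xv = [0.2, 0.3]; Xv = xv * transpose(xv); L1v = 1 - xv[1] - xv[2]
L1v * (-xv[1]) ≈ Xv[1, 1] + Xv[1, 2] - xv[1] # true: the lifted numerator equals L1 * (-x[1])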
It is known from the MOSEK Modeling Cookbook that, in JuMP.jl, the feasibility system
a[2] * exp(a[1] / a[2]) <= a[3]; a[2] > 0
or, equivalently,
a[2] * nl(a[3] / a[2]) <= -a[1]; a[2] > 0, a[3] > 0
can be coded as
[a[1], a[2], a[3]] in JuMP.MOI.ExponentialCone().
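As a minimal standalone illustration (a hypothetical toy model, separate from (CR)), the epigraph constraint exp(u) <= s can be encoded and optimized as follows:
import JuMP, MosekTools
toy = JuMP.Model(MosekTools.Optimizer)
JuMP.@variable(toy, u >= 1)
JuMP.@variable(toy, s)
JuMP.@constraint(toy, [u, 1, s] in JuMP.MOI.ExponentialCone()) # enforces 1 * exp(u / 1) <= s
JuMP.@objective(toy, Min, s)
JuMP.optimize!(toy)
JuMP.value(s) # ≈ exp(1)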
We now present the code for the convex relaxation (CR).
Note that the following code is readily runnable.
# start a pristine julia REPL
import JuMP
import MosekTools # as the conic solver
E = 1 + exp(-1);
CR = JuMP.Model(MosekTools.Optimizer)
JuMP.@variable(CR, x[1:2]); # the primal decisions
# the following 2 lines are routinely written
JuMP.@variable(CR, X[eachindex(x), eachindex(x)], Symmetric);
JuMP.@constraint(CR, [1 transpose(x); x X] >= 0, JuMP.PSDCone()); # SDP cut
# We add the constraint `Lin1`
JuMP.@expression(CR, L1, 1 - x[1] - x[2])
JuMP.@constraint(CR, Lin1, L1 >= 0)
# To proceed with the `exp` function, we need some auxiliary epigraphical variables, named t
JuMP.@variable(CR, t[1:5]);
# We add the constraint `Conv1` through the following 3 lines with the help of `t[4:5]`
JuMP.@constraint(CR, t[4] + t[5] <= E)
JuMP.@constraint(CR, [-x[1], 1, t[4]] in JuMP.MOI.ExponentialCone())
JuMP.@constraint(CR, [-x[2], 1, t[5]] in JuMP.MOI.ExponentialCone())
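# recall that [a1, a2, a3] in ExponentialCone() enforces a2 * exp(a1 / a2) <= a3;
# hence exp(-x[1]) <= t[4] and exp(-x[2]) <= t[5], and `Conv1` follows from t[4] + t[5] <= E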
# We add the constraint `Valid1` through the following 3 lines with the help of `t[2:3]`
JuMP.@constraint(CR, [X[1, 1] + X[1, 2] - x[1], L1, t[2]] in JuMP.MOI.ExponentialCone())
JuMP.@constraint(CR, [X[1, 2] + X[2, 2] - x[2], L1, t[3]] in JuMP.MOI.ExponentialCone())
JuMP.@constraint(CR, t[2] + t[3] <= E * L1)
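# by the same rule, t[2] >= L1 * exp((X[1, 1] + X[1, 2] - x[1]) / L1) and
# t[3] >= L1 * exp((X[1, 2] + X[2, 2] - x[2]) / L1), so `Valid1` is enforced via t[2] + t[3] <= E * L1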
# We build the `concave_term` in the objective function through the following 3 lines with the help of `t[1]`
JuMP.@expression(CR, L2, x[1] + 2);
nothing; # we can add a `Lin2` constraint specifying `L2 >= 0` here, like `Lin1`, but we've proved that this is already implied.
JuMP.@constraint(CR, [-t[1], L2, X[1, 1] + 4 * x[1] + 4] in JuMP.MOI.ExponentialCone())
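# this cone constraint reads L2 * exp(-t[1] / L2) <= X[1, 1] + 4 * x[1] + 4,
# i.e. t[1] >= convex_term(x[1], X[1, 1]), the epigraph of the reformulated `concave_term`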
# We finish the objective
JuMP.@objective(CR, Min, 2 * x[1] + 3 * x[2] - 5 * X[1, 2] + t[1])
JuMP.optimize!(CR)
@assert JuMP.termination_status(CR) == JuMP.OPTIMAL
lb = JuMP.objective_bound(CR) # a valid lower bound on the optimal value of (NC)
# redefine the primal data of (NC) to evaluate an upper bound at the returned point
function nl(x) return -log(x) end
function concave_term(x) return (x + 2) * nl(x + 2) end
function primal_obj_f(x) return 2 * x[1] + 3 * x[2] - 5 * x[1] * x[2] + concave_term(x[1]) end
feasible_x_returned_by_CR = JuMP.value.(x) # [0.8030754682915741, 0.1969245334462786]
ub = primal_obj_f(feasible_x_returned_by_CR)
# We observe that lb = -1.4829799922508433 < -1.4829798593248442 = ub, with ub - lb ≈ 1.3e-7
# Given this accuracy, we conclude that the primal nonconvex problem (NC) is solved to global optimality (within solver tolerance) via (CR)
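As an optional final check (not part of the original run), one can verify that the returned point is feasible for (NC) up to solver tolerance:
xv = feasible_x_returned_by_CR
@assert xv[1] + xv[2] <= 1 + 1e-6 # Lin1 holds up to tolerance
@assert exp(-xv[1]) + exp(-xv[2]) <= E + 1e-6 # Conv1 holds up to tolerance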
This example was from