# How to optimize trace of matrix inverse with JuMP or Convex?

Hi community,

I want to minimize $\operatorname{tr}(S^{-1}) + \operatorname{tr}\big((A - S)^{-1}\big)$ subject to $S \succeq 0$ and $A - S \succeq 0$, where $S$ is the matrix optimization variable and $A$ is a fixed positive definite matrix.
I tried using JuMP + Hypatia:

```julia
using Hypatia, JuMP, LinearAlgebra

p = 5
x = randn(p, p)
A = Symmetric(x' * x)

# model and constraints
model = Model(() -> Hypatia.Optimizer())
@variable(model, S[1:p, 1:p], Symmetric)
@constraint(model, S in PSDCone())
@constraint(model, A - S in PSDCone())

# objective
@NLobjective(model, Min, tr(inv(S)) + tr(inv(A - S)))

# solve and return
JuMP.optimize!(model)
```


which gives me the following error:

```
Unexpected array VariableRef[S[1,1] S[1,2] S[1,3] S[1,4] S[1,5]; S[1,2] S[2,2] S[2,3] S[2,4] S[2,5]; S[1,3] S[2,3] S[3,3] S[3,4] S[3,5]; S[1,4] S[2,4] S[3,4] S[4,4] S[4,5]; S[1,5] S[2,5] S[3,5] S[4,5] S[5,5]] in nonlinear expression. Nonlinear expressions may contain only scalar expressions.

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] parse_expression(#unused#::MathOptInterface.Nonlinear.Model, expr::MathOptInterface.Nonlinear.Expression, x::Symmetric{VariableRef, Matrix{VariableRef}}, parent_index::Int64)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:235
  [3] parse_expression(data::MathOptInterface.Nonlinear.Model, expr::MathOptInterface.Nonlinear.Expression, x::Expr, parent_index::Int64)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:52
  [4] parse_expression
    @ ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:14 [inlined]
  [5] set_objective
    @ ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/model.jl:43 [inlined]
  [6] set_nonlinear_objective(model::Model, sense::MathOptInterface.OptimizationSense, x::Expr)
    @ JuMP ~/.julia/packages/JuMP/puvTM/src/nlp.jl:163
  [7] macro expansion
    @ ~/.julia/packages/JuMP/puvTM/src/macros.jl:2332 [inlined]
  [8] top-level scope
    @ In[4]:12
  [9] eval
    @ ./boot.jl:373 [inlined]
 [10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
```


I also tried introducing a variable $t$ with $t \ge \operatorname{tr}(S^{-1})$, but there I get a `no method matching VariableRef(::AffExpr)` error.

It seems CVX supports `trace_inv`. Is there a similar alternative in JuMP or Convex.jl? I couldn't find one.

Any help is highly appreciated.

I don't know if there's an easy way to do this in JuMP. You can't call arbitrary functions like `inv` inside `@NL` macros (see Nonlinear Modeling · JuMP), and even if you could, Hypatia doesn't support the nonlinear interface.

There might be some relationship you can use to reformulate `tr(inv(X))` into something else, but I'm not sure. Someone else might have a good idea.

Also, as a tip, you can change your model to the slightly more compact form:

```julia
model = Model(Hypatia.Optimizer)
@variable(model, S[1:p, 1:p], PSD)
@constraint(model, A >= S, PSDCone())
```


Thank you for your quick response. I found this post, which recommends I do the following:

```julia
using Hypatia, JuMP, LinearAlgebra

p = 5
x = randn(p, p)
A = Symmetric(x' * x)

# model and constraints
model = Model(() -> Hypatia.Optimizer())
@variable(model, S[1:p, 1:p], PSD)
@constraint(model, A >= S, PSDCone())

# introduce X and Y
@variable(model, X[1:p, 1:p])
@variable(model, Y[1:p, 1:p])
@constraint(model, [X I; I S] in PSDCone())
@constraint(model, [Y I; I A-S] in PSDCone())

# objective
@objective(model, Min, tr(X) + tr(Y))

# solve and return
JuMP.optimize!(model)
```


which seems to work. The idea is that instead of working with $\operatorname{tr}(S^{-1})$ in the objective, we work with $\operatorname{tr}(X)$ where $\begin{bmatrix} X & I \\ I & S \end{bmatrix} \succeq 0$. By a Schur complement argument (for $S \succ 0$), $\begin{bmatrix} X & I \\ I & S \end{bmatrix} \succeq 0 \iff X - S^{-1} \succeq 0 \iff X \succeq S^{-1}$, which implies $\operatorname{tr}(X) \ge \operatorname{tr}(S^{-1})$, and the inequality is tight at the optimum.
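As a quick numerical sanity check (my own addition, assuming the model above has solved to optimality), the epigraph objective and the original trace-inverse objective should agree:

```julia
# Compare the epigraph objective tr(X) + tr(Y) with the original
# objective evaluated at the solution; they agree when the bound is tight.
Sval = value.(S)
println("epigraph objective:      ", objective_value(model))
println("tr(S^-1) + tr((A-S)^-1): ", tr(inv(Sval)) + tr(inv(A - Sval)))
```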


Nice! That's a useful trick to know.

You could try to optimize over the trace inverse directly. One of the more complicated cones Hypatia has is `EpiPerSepSpectralCone`, which is parameterized by a domain and a "separable spectral function". For this case, the separable spectral function of interest is the trace inverse. The domain of interest is the PSD cone.

There is no exported function type for "trace inverse", but that's because it is the convex conjugate of the negative trace square root, `NegSqrtSSF` (scaled by a factor of 4). You can make the following dual cone:

```julia
cone = EpiPerSepSpectralCone{Float64}(Hypatia.Cones.NegSqrtSSF(), Hypatia.Cones.MatrixCSqr{Float64, Float64}, p, true)
```


For reference, see Table 1 of https://arxiv.org/pdf/2103.04104.pdf.
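Concretely, here is where the factor of 4 comes from (a sketch of the scalar computation; Hypatia applies the conjugate eigenvalue-wise, and I am glossing over its exact sign convention for dual cones). For $h(x) = -\sqrt{x}$ on $x \ge 0$ and any $y > 0$,

$$\sup_{x \ge 0}\left(\sqrt{x} - x y\right) = \frac{1}{4y}, \qquad \text{attained at } x = \frac{1}{4y^2}.$$

Summed over the eigenvalues of $W \succ 0$, this gives $\tfrac{1}{4}\operatorname{tr}(W^{-1})$, which is why the objective below multiplies $t_1 + t_2$ by 4.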

```julia
model = Model(Hypatia.Optimizer)
@variable(model, t1)
@variable(model, t2)
@variable(model, S[1:p, 1:p], Symmetric)
@constraint(model, vcat(1.0, t1, [(A[i, j] - S[i, j]) * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)
@constraint(model, vcat(1.0, t2, [S[i, j] * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)
@objective(model, Min, 4 * (t1 + t2))
```


`A - S in PSDCone()` and `S in PSDCone()` are implied by the cone constraints.
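To check both the objective and the implied PSD constraints after solving (my own sketch; it reuses `model`, `S`, `t1`, `t2`, and `A` from above, plus `using LinearAlgebra` for `eigvals`):

```julia
JuMP.optimize!(model)
Sval = value.(S)

# objective recovered from the epigraph variables
println("objective: ", 4 * (value(t1) + value(t2)))

# the PSD constraints are implied, so these should be >= 0 up to solver tolerance
println("min eigenvalue of S:     ", minimum(eigvals(Symmetric(Sval))))
println("min eigenvalue of A - S: ", minimum(eigvals(Symmetric(A - Sval))))
```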


@lkapelevich Thank you for the suggestion.

I tried your approach on the same problem. The answers agree, and when `p = 50`, your solution took ~120 seconds on my machine, while my original solution took ~960 seconds. So it's definitely worth looking into, although I'm not really sure what you are doing. I'll have to spend more time on it. Thank you so much!!


I skipped a lot of steps; happy to clarify anything. For:

```julia
cone = EpiPerSepSpectralCone{T}(
    h::Cones.SepSpectralFun,
    Q::Type{<:Cones.ConeOfSquares{T}},
    d::Int,
    use_dual::Bool,
)
```


we are making a cone that is the closure of

$$\{(u, v, w) : v > 0,\; w \in Q,\; u \ge v \cdot h(w / v)\}$$

if `use_dual` is false, and

$$\{(u, v, w) : u > 0,\; w \in Q,\; v \ge u \cdot h^\ast(w / u)\}$$

if `use_dual` is true.

I picked `h = NegSqrtSSF()` and `Q = MatrixCSqr{Float64, Float64}`, which is analogous to the cone of PSD matrices. Then:

```julia
@constraint(model, vcat(1.0, t2, [S[i, j] * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)
```


says $t_2 \ge \operatorname{tr}(S^{-1})$ and that $S$ is PSD. The matrix variable `S` is vectorized and its off-diagonal entries are scaled by $\sqrt{2}$ because that's how matrix variables are accepted natively (the scaled svec format).
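For reference, a small helper (my own sketch, not part of Hypatia's API) that performs the same vectorization as the comprehensions above:

```julia
# svec-style vectorization used in the constraints above: stack the lower
# triangle row by row, scaling off-diagonal entries by sqrt(2) so that
# dot(svec_scale(X), svec_scale(Y)) == tr(X * Y) for symmetric X and Y.
svec_scale(M) = [M[i, j] * (i == j ? 1 : sqrt(2)) for i in 1:size(M, 1) for j in 1:i]

# the constraint on S can then be written as:
# @constraint(model, vcat(1.0, t2, svec_scale(S)...) in cone)
```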
