How to optimize trace of matrix inverse with JuMP or Convex?

biona001 · February 6, 2023, 9:25pm

Hi community,

I want to minimize tr(S^{-1}) + tr((A - S)^{-1}) such that S \succeq 0 and A - S \succeq 0 (S is the optimization matrix, A is some fixed positive definite matrix)
I tried using JuMP + Hypatia:

using Hypatia, JuMP
p = 5
x = randn(p, p)
A = Symmetric(x' * x)

# model and constraints constraints
model = Model(() -> Hypatia.Optimizer())
@variable(model, S[1:p, 1:p], Symmetric)
@constraint(model, S in PSDCone())
@constraint(model, A - S in PSDCone())

# objective
@NLobjective(model, Min, tr(inv(S)) + tr(inv(A - S)))

# solve and return
JuMP.optimize!(model)

which gives me the following error:

Unexpected array VariableRef[S[1,1] S[1,2] S[1,3] S[1,4] S[1,5]; S[1,2] S[2,2] S[2,3] S[2,4] S[2,5]; S[1,3] S[2,3] S[3,3] S[3,4] S[3,5]; S[1,4] S[2,4] S[3,4] S[4,4] S[4,5]; S[1,5] S[2,5] S[3,5] S[4,5] S[5,5]] in nonlinear expression. Nonlinear expressions may contain only scalar expressions.

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] parse_expression(#unused#::MathOptInterface.Nonlinear.Model, expr::MathOptInterface.Nonlinear.Expression, x::Symmetric{VariableRef, Matrix{VariableRef}}, parent_index::Int64)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:235
  [3] parse_expression(data::MathOptInterface.Nonlinear.Model, expr::MathOptInterface.Nonlinear.Expression, x::Expr, parent_index::Int64)
    @ MathOptInterface.Nonlinear ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:52
  [4] parse_expression
    @ ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/parse.jl:14 [inlined]
  [5] set_objective
    @ ~/.julia/packages/MathOptInterface/cl3eR/src/Nonlinear/model.jl:43 [inlined]
  [6] set_nonlinear_objective(model::Model, sense::MathOptInterface.OptimizationSense, x::Expr)
    @ JuMP ~/.julia/packages/JuMP/puvTM/src/nlp.jl:163
  [7] macro expansion
    @ ~/.julia/packages/JuMP/puvTM/src/macros.jl:2332 [inlined]
  [8] top-level scope
    @ In[4]:12
  [9] eval
    @ ./boot.jl:373 [inlined]
 [10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1196

I also tried to define some variable t such that t \ge tr(S^{-1}), but there I get a no method matching VariableRef(::AffExpr) error.

Seems like CVX supports trace_inv, is there a similar alternative in JuMP or Convex.jl? I couldn’t find one.

Any help is highly appreciated.

odow · February 6, 2023, 10:00pm

I don’t know if there’s an easy way to do this in JuMP. You can’t call arbitrary functions in the @NL like inv: Nonlinear Modeling · JuMP, and even if you could, Hypatia doesn’t support the nonlinear interface.

There might be some relationship you can use to reformulate tr(inv(X)) into something else, but I’m not sure. Someone else might have a good idea.

Also, as a tip, you can change your model to the slightly more compact form:

model = Model(Hypatia.Optimizer)
@variable(model, S[1:p, 1:p], PSD)
@constraint(model, A >= S, PSDCone())

biona001 · February 6, 2023, 11:46pm

Thank you for your quick response. I found this post which recommends I do the following

using Hypatia, JuMP, LinearAlgebra
p = 5
x = randn(p, p)
A = Symmetric(x' * x)

# model and constraints constraints
model = Model(() -> Hypatia.Optimizer())
@variable(model, S[1:p, 1:p], PSD)
@constraint(model, A >= S, PSDCone())

# introduce X and Y
@variable(model, X[1:p, 1:p])
@variable(model, Y[1:p, 1:p])
@constraint(model, [X I; I S] in PSDCone())
@constraint(model, [Y I; I A-S] in PSDCone())

# objective
@objective(model, Min, tr(X) + tr(Y))

# solve and return
JuMP.optimize!(model)

which seems to work. The idea is that instead of working with tr(S^{-1}) in the objective, we work with tr(X) where \begin{bmatrix}X & I\\ I & S\end{bmatrix} \succeq 0. Then by Schur complement argument, \begin{bmatrix}X & I\\ I & S\end{bmatrix} \succeq 0 \iff X - S^{-1} \succeq 0 \iff X \succeq S^{-1} \iff tr(X) \ge tr(S^{-1}).

odow · February 7, 2023, 12:57am

Nice! That’s a useful trick to know.

lkapelevich · February 7, 2023, 10:58pm

You could try to optimize over trace inverse directly. One of the more complicated cones Hypatia has is EpiPerSepSpectralCone, which is parameterized by a domain and a “separable spectral function”. For this case, the separable spectral function of interest is the trace inverse. The domain of interest is the PSD cone.

There is no exported function type for “trace inverse” but that’s because it is the convex conjugate of negative-trace-squareroot, NegSqrtSSF (scaled by a factor of 4). You can make the following dual cone:

cone = EpiPerSepSpectralCone{Float64}(Hypatia.Cones.NegSqrtSSF(), Hypatia.Cones.MatrixCSqr{Float64, Float64}, p, true)

For reference, see table 1: https://arxiv.org/pdf/2103.04104.pdf.
For your model:

model = Model(Hypatia.Optimizer)
@variable(model, t1)
@variable(model, t2)
@variable(model, S[1:p, 1:p], Symmetric)
@constraint(model, vcat(1.0, t1, [(A[i, j] - S[i, j])  * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)
@constraint(model, vcat(1.0, t2, [S[i, j] * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)
@objective(model, Min, 4 * (t1 + t2))

A-S in PSDCone() and S in PSDCone() are implied in the cone constraints.

biona001 · February 8, 2023, 2:08am

@lkapelevich Thank you for the suggestion.

I tried your approach on the same problem. Answers agree, and when p=50, your solution took ~120 seconds on my machine, while my original solution took ~960 seconds. So it’s definitely worth looking into, although I’m not really sure what you are doing. I’ll have to spend more time on it. Thank you so much!!

lkapelevich · February 10, 2023, 1:03am

I skipped a lot of steps, happy to clarify anything. For:

cone = EpiPerSepSpectralCone{T}(
    h::Cones.SepSpectralFun,
    Q::Type{<:Cones.ConeOfSquares{T}},
    d::Int,
	use_dual::Bool,
	)

we are making a cone that is the closure of:
(u, v, w) : v > 0, w \in Q, u \geq v \cdot h(w / v)
if use_dual is false, and
(u, v, w) : u > 0, w \in Q, v \geq u \cdot h^\ast(w / u)
if use_dual is true.

I picked h = NegSqrtSSF() and Q = MatrixCSqr{Float64, Float64}- which is like the PSD matrices. Then:

@constraint(model, vcat(1.0, t2, [S[i, j] * (i == j ? 1 : sqrt(2)) for i in 1:p for j in 1:i]...) in cone)

says t2 >= trace inverse of S and S is PSD. The matrix variable S is vectorized and scaled because that’s how matrix variables are accepted natively.

Topic		Replies	Views
Matrix inverse as objective Optimization (Mathematical) jump , optim	2	80	May 20, 2025
Matrix inversion inside JuMP objective Optimization (Mathematical) jump	4	371	September 26, 2023
Handling nonlinear objective involving the inverse of a matrix Optimization (Mathematical)	2	82	December 11, 2024
Matrix inverse in constraints in `@NLconstraint` Optimization (Mathematical) jump , optimization , nonlinear	3	381	October 9, 2022
Solve for Hermitian PSD with non-linear objective in JuMP Optimization (Mathematical) jump	19	1110	May 27, 2021

How to optimize trace of matrix inverse with JuMP or Convex?

Related topics