JuMP: Squared Frobenius norm objective

Hi. I can’t seem to find a definitive answer so I’m posting here for help. What is the standard and latest way in JuMP to construct an objective function for a Frobenius norm?

Say I have a (fictional) minimization of

\min_{P\in\mathbb{R}^{n\times n}} \| \mathbf{PAX} - \mathbf{QX} \|_F^2

where \mathbf{X}\in\mathbb{R}^{n\times K}, K > n, is some data matrix, and the matrices \mathbf{A}\in\mathbb{R}^{n\times n} and \mathbf{Q}=\mathbf{Q}^\top\in\mathbb{R}^{n\times n} are given. Assume also that \mathbf{P} = \mathbf{P}^\top.

Would the code to solve this be like the following?

using JuMP
import Ipopt

n = 5
K = 100
X = rand(n, K)
A = rand(n, n)
Q = rand(n, n)
Q *= Q'  # make Q symmetric
model = Model(Ipopt.Optimizer)
@variable(model, P[1:n, 1:n], Symmetric)
@objective(model, Min, sum((P*A*X - Q*X).^2))
optimize!(model)

Or do I have to use MOI.SecondOrderCone(), as in the example below (for the vector case)?

@constraint(model, [t; x] in SecondOrderCone())
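
(For reference, a minimal, self-contained version of that vector pattern, with made-up data C and d, minimizing \| \mathbf{Cx} - \mathbf{d} \|_2 via an epigraph variable t, might look roughly like:)

using JuMP
import SCS

# Hypothetical data for illustration only.
C = rand(10, 3)
d = rand(10)
model = Model(SCS.Optimizer)
@variable(model, x[1:3])
@variable(model, t)
# Epigraph trick: the cone constraint enforces t >= ||C*x - d||_2.
@constraint(model, [t; C * x - d] in SecondOrderCone())
@objective(model, Min, t)
optimize!(model)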

I think what you’ve done is reasonable.
Below is how I would write your example, plus a SecondOrderCone version as you suggested:

using JuMP

import Ipopt
import LinearAlgebra
import SCS

n = 5
K = 100

## Generate data:
X = rand(n, K)
A = rand(n, n)
Q = LinearAlgebra.Symmetric(rand(n, n))

## Nonlinear version:
solver = Ipopt
model = Model(solver.Optimizer)
@variable(model, P[1:n, 1:n], Symmetric)
@objective(model, Min, sum((P*A*X - Q*X).^2));
optimize!(model)
solution_summary(model)

z_nl = objective_value(model)
P_nl = value.(P)

## SoC version:
solver = SCS
model = Model(solver.Optimizer)
@variable(model, P[1:n, 1:n], Symmetric)
@variable(model, t)
@constraint(model, [t; vec(P*A*X - Q*X)] in SecondOrderCone());
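# The cone constraint enforces t >= ||vec(P*A*X - Q*X)||_2, i.e. t bounds the Frobenius norm
# (not its square), which is why the objective value is squared below for comparison.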
@objective(model, Min, t)
optimize!(model)
solution_summary(model)

z_soc = objective_value(model)^2
P_soc = value.(P)

## Compare
isapprox(z_soc, z_nl, rtol=1e-3)

LinearAlgebra.norm(P_soc - P_nl) <= 1e-3

While you’ve probably just given a simplified problem, this type of problem has an “exact” solution using the Moore–Penrose inverse:

\mathbf{P}^\star = \mathbf{QX}(\mathbf{AX})^{+}

(see Ben-Israel, A., & Greville, T. N. (2003). Generalized Inverses: Theory and Applications (Vol. 15). Springer.)
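
For completeness, a sketch of why this works, treating \mathbf{P} as unconstrained (i.e. ignoring the symmetry requirement) and writing \mathbf{M} = \mathbf{AX}, \mathbf{B} = \mathbf{QX}: setting the gradient of \| \mathbf{PM} - \mathbf{B} \|_F^2 with respect to \mathbf{P} to zero gives

2(\mathbf{PM} - \mathbf{B})\mathbf{M}^\top = \mathbf{0} \quad\Longrightarrow\quad \mathbf{P} = \mathbf{B}\mathbf{M}^\top(\mathbf{M}\mathbf{M}^\top)^{-1} = \mathbf{B}\mathbf{M}^{+},

where the last step uses \mathbf{M}^{+} = \mathbf{M}^\top(\mathbf{M}\mathbf{M}^\top)^{-1}, which is valid when \mathbf{M} has full row rank (plausible here since K > n and \mathbf{X} is random).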

## "Exact" version
P_pi = Q*X*LinearAlgebra.pinv(A*X)
z_pi = LinearAlgebra.norm(P_pi*A*X - Q*X, 2)

# Compare to:
LinearAlgebra.norm(P_soc*A*X - Q*X, 2)
LinearAlgebra.norm(P_nl *A*X - Q*X, 2)

Hopefully you can see how much smaller the residual is for the exact pseudoinverse version.


@jd-foster Thanks for the detailed explanation. And yes, I agree that for this simplified example an exact solution exists; thanks for that clarification as well.

Do you know whether using SecondOrderCone() would make the optimization more or less efficient (with a solver like SCS)? I’m curious what the most efficient implementation would look like.

In the case of SCS, I would guess that the second (conic) form would be more efficient, since the solver’s algorithm is designed around cones. With a general nonlinear solver like Ipopt, I believe the first form is best.

The way to really know would be to benchmark over increasing test problem sizes while tracking the accuracy against the known solution.
@odow Thoughts?

You probably want RotatedSecondOrderCone instead of SecondOrderCone. See Tips and Tricks · JuMP.

“Most efficient” is probably problem-dependent. You should just try a few different formulations. Ipopt.jl does not support the conic formulation.

using JuMP
import Ipopt
import LinearAlgebra
import SCS

function test_ipopt(X, A, Q)
    n = size(X, 1)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, P[1:n, 1:n], Symmetric)
    @objective(model, Min, sum((P*A*X - Q*X).^2));
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return objective_value(model), value.(P)
end

function test_ipopt2(X, A, Q)
    n, K = size(X)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, P[1:n, 1:n], Symmetric)
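    # Introduce explicit residual variables so the objective stays a simple sum of squares
    # and the affine coupling moves into linear constraints.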
    @variable(model, residuals[1:n, 1:K])
    @constraint(model, residuals .== P*A*X - Q*X)
    @objective(model, Min, sum(residuals.^2));
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return objective_value(model), value.(P)
end

function test_scs(X, A, Q)
    n = size(X, 1)
    model = Model(SCS.Optimizer)
    set_silent(model)
    @variable(model, P[1:n, 1:n], Symmetric)
    @variable(model, t)
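    # RotatedSecondOrderCone is {(t, u, r) : 2*t*u >= ||r||^2}; with u = 0.5 this gives
    # t >= ||r||^2, so t bounds the squared Frobenius norm of the residual directly.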
    @constraint(model, [t; 0.5; vec(P*A*X - Q*X)] in RotatedSecondOrderCone());
    @objective(model, Min, t)
    optimize!(model)
    return objective_value(model), value.(P)
end

n, K = 5, 100
X = rand(n, K);
A = rand(n, n);
Q = LinearAlgebra.Symmetric(rand(n, n))

@time test_ipopt(X, A, Q)
@time test_ipopt2(X, A, Q)
@time test_scs(X, A, Q)
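
To follow up on the earlier suggestion of benchmarking over increasing problem sizes, a rough sketch (reusing the functions above; the sizes are arbitrary, the first calls include compilation time, and the unconstrained pseudoinverse residual is used as the accuracy reference) might look like:

for n in (5, 10, 20)
    K = 20 * n
    X = rand(n, K)
    A = rand(n, n)
    Q = LinearAlgebra.Symmetric(rand(n, n))
    # Unconstrained pseudoinverse solution as an accuracy reference:
    P_pi = Q * X * LinearAlgebra.pinv(A * X)
    z_pi = LinearAlgebra.norm(P_pi * A * X - Q * X)^2
    for f in (test_ipopt, test_ipopt2, test_scs)
        stats = @timed f(X, A, Q)
        z, _ = stats.value
        println(f, "  n = ", n, "  time = ", round(stats.time; digits = 2),
                "s  objective = ", z, "  (pinv reference: ", z_pi, ")")
    end
end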

@odow For Ipopt, since I’m dealing with high-dimensional problems, constraining the residual as in the second example you gave took much less time to set up the optimization problem than the first one. I’m guessing this is because Ipopt works well with constraints.

Thanks for the tips!
