# Minimisation of operator norm—solution does not match evaluated value [Convex.jl]

I’m using Convex.jl to minimize the operator norm of a matrix (minimize opnorm(J,Inf)), but the solution (problem.optval) does not match the evaluated value after performing the optimisation (opnorm(evaluate(J,Inf))).

A simplified example follows below. The matrix takes a specific form (a Choi matrix of the difference of two quantum channels), and would depend on the problem at hand, but the randomly-generated matrices demonstrates the issue.

using Convex, SCS, LinearAlgebra

function opt_norm_issue(U,K;d₁,d₂)
ρ = Semidefinite(d₂)
J = sum(kron(
((1:d₁).==j)*((1:d₁).==k)',
partialtrace(
U'*(
K*kron(((1:d₁).==j)*((1:d₁).==k)', I(d₂))*K' -
kron(((1:d₁).==j)*((1:d₁).==k)', ρ)
)*U
, 2, [d₁,d₂])
) for j ∈ 1:d₁, k ∈ 1:d₁)

constraints = [
tr(ρ) == 1
]

optimizer = SCS.Optimizer()
Convex.MOI.set(optimizer, Convex.MOI.RawOptimizerAttribute("eps_abs"), 1E-9)
Convex.MOI.set(optimizer, Convex.MOI.RawOptimizerAttribute("eps_rel"), 1E-12)

problem = minimize(opnorm(J,Inf), constraints)
solve!(problem, optimizer)

return problem.optval, opnorm(evaluate(J), Inf)
end

Running the following lines a few times with different values of d₁ and d₂ show that problem.optval and opnorm(evaluate(J),Inf) do not match, and can be quite different for some K and U.

d₁,d₂ = 2,2;
H = rand(ComplexF64,d₁*d₂,d₁*d₂);
U = exp(im*π*(H+H'));
K = rand(ComplexF64,d₁*d₂,d₁*d₂);
K *= d₁/tr(K'K);

opt_norm_issue(U,K;d₁,d₂)

Any ideas why this is happening? It’s particularly problematic because I need the arguments of the optimal result, but the arguments do not actually give the (apparent) optimal value.

1 Like

Hi there, welcome to the forum.

This is an interesting problem. For others reading, this is what I get:

julia> using Convex, SCS, LinearAlgebra

julia> const MOI = Convex.MOI
MathOptInterface

julia> function opt_norm_issue(U, K; d₁, d₂)
ρ = Semidefinite(d₂)
J = sum(
kron(
((1:d₁).==j)*((1:d₁).==k)',
partialtrace(
U'*(
K*kron(((1:d₁).==j)*((1:d₁).==k)', I(d₂))*K' -
kron(((1:d₁).==j)*((1:d₁).==k)', ρ)
)*U
, 2, [d₁,d₂])
)
for j in 1:d₁, k in 1:d₁
)
constraints = [LinearAlgebra.tr(ρ) == 1]
solver = SCS.Optimizer()
MOI.set(solver, MOI.RawOptimizerAttribute("eps_abs"), 1E-9)
MOI.set(solver, MOI.RawOptimizerAttribute("eps_rel"), 1E-12)
problem = minimize(opnorm(J, Inf), constraints)
solve!(problem, solver)
return problem.optval, opnorm(evaluate(J), Inf)
end
opt_norm_issue (generic function with 1 method)

julia> d₁, d₂ = 2, 2;

julia> H = rand(ComplexF64, d₁ * d₂, d₁ * d₂);

julia> U = exp(im * π * (H + H'));

julia> K = rand(ComplexF64, d₁ * d₂, d₁ * d₂);

julia> K *= d₁ / LinearAlgebra.tr(K' * K);

julia> opt_norm_issue(U, K; d₁, d₂)
------------------------------------------------------------------
SCS v3.2.3 - Splitting Conic Solver
(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 38, constraints m: 74
cones: 	  z: primal zero / dual free vars: 3
l: linear vars: 20
q: soc vars: 48, qsize: 16
s: psd vars: 3, ssize: 1
settings: eps_abs: 1.0e-09, eps_rel: 1.0e-12, eps_infeas: 1.0e-07
alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
max_iters: 100000, normalize: 1, rho_x: 1.00e-06
acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-direct-amd-qdldl
nnz(A): 205, nnz(P): 0
------------------------------------------------------------------
iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
0| 1.12e+01  1.00e+00  5.25e+01 -2.59e+01  1.00e-01  2.82e-04
250| 2.10e-04  6.08e-04  2.61e-04  8.23e-01  5.19e-01  1.36e-03
450| 4.00e-11  2.08e-13  4.01e-11  8.23e-01  5.19e-01  2.05e-03
------------------------------------------------------------------
status:  solved
timings: total: 2.06e-03s = setup: 2.17e-04s + solve: 1.84e-03s
lin-sys: 8.52e-04s, cones: 5.85e-04s, accel: 6.00e-05s
------------------------------------------------------------------
objective = 0.823033
------------------------------------------------------------------
(0.8230325137470799, 0.8002661860842999)

We’d expect that the final two numbers are identical.

I assume that this is happening because the reformulation from the original problem into a conic form that SCS can solve introduces additional variables and constraints that allow some tolerance issues, but I wouldn’t have expected the two values to be so different. Although because it’s a matrix norm, small changes in the variable values can cause larger changes in the output.

@ericphanson might have some insight into what’s going on.

3 Likes

Thanks for checking out the issue.

I’ve since tried simplifying the optimisation to the following minimisation to isolate where the issue stems from:

\begin{aligned} \min_{t,\rho} \;&\; t,\\ \text{for} \;&\; J := \sum_{j,k} E_{jk} \otimes \left\{ U^\dagger \left[ K\left(E_{jk} \otimes I\right)K^\dagger - E_{jk} \otimes \rho \right] U\right\} \\ \text{subject to}\;&\;\operatorname{tr}\rho = 1,\\ &\; \rho \succeq 0, \\ &\; tI - J \succeq 0, \\ &\; tI + J \succeq 0. \end{aligned}

Here E_{jk} is the matrix with 1 on the j-th row, k-th column, and zero everywhere else; I is the identity matrix. This explicitly minimises the spectral norm of J. The spectral norm is unitarily invariant, so it should give the same result no matter what U is.

My findings are

• JuMP gives consistent results using the same optimiser and tolerances, so it’s a problem with Convex.jl
• The issue only occurs when both U and K are complex. It doesn’t happen when either of them is purely real or purely imaginary

Here’s my code:

using Convex, SCS, LinearAlgebra, JuMP

function opt_Convex(U,K;d₁,d₂)
ρ = Semidefinite(d₂)
t = Variable()
J = sum(kron(
((1:d₁).==j)*((1:d₁).==k)',
U'*(
K*kron(((1:d₁).==j)*((1:d₁).==k)', I(d₂))*K' -
kron(((1:d₁).==j)*((1:d₁).==k)', ρ)
)*U
) for j ∈ 1:d₁, k ∈ 1:d₁)

constraints = [
tr(ρ) == 1,
t*I(d₁^2*d₂)-J ⪰ 0,
t*I(d₁^2*d₂)+J ⪰ 0,
]

optimizer = SCS.Optimizer()
Convex.MOI.set(optimizer, Convex.MOI.RawOptimizerAttribute("eps_abs"), 1E-9)
Convex.MOI.set(optimizer, Convex.MOI.RawOptimizerAttribute("eps_rel"), 1E-12)

problem = minimize(t, constraints)
solve!(problem, optimizer)

return problem.optval, opnorm(evaluate(J), 2)
end

function opt_JuMP(U,K;d₁,d₂)
model = Model(SCS.Optimizer)
@variable(model, ρ[1:d₂,1:d₂] ∈ HermitianPSDCone())
@variable(model, t)
J = sum(kron(
((1:d₁).==j)*((1:d₁).==k)',
U'*(
K*kron(((1:d₁).==j)*((1:d₁).==k)', I(d₂))*K' -
kron(((1:d₁).==j)*((1:d₁).==k)', ρ)
)*U
) for j ∈ 1:d₁, k ∈ 1:d₁)

@constraint(model, tr(ρ)==1)
@constraint(model, Hermitian(t*I(d₁^2*d₂)-J) ∈ HermitianPSDCone())
@constraint(model, Hermitian(t*I(d₁^2*d₂)+J) ∈ HermitianPSDCone())

MOI.set(model, MOI.RawOptimizerAttribute("eps_abs"), 1E-9)
MOI.set(model, MOI.RawOptimizerAttribute("eps_rel"), 1E-12)

@objective(model, Min, t)
optimize!(model)

return objective_value(model), opnorm(value.(J), 2)
end

d₁,d₂ = 2,2;

# K[1] is purely real, K[2] is purely imaginary, K[3] is complex
K = Any[reshape(1:(d₁*d₂)^2,d₁*d₂,d₁*d₂) + im*reshape((d₁*d₂)^2+1:2*(d₁*d₂)^2,d₁*d₂,d₁*d₂) for _ ∈ 1:3];
K[3] *= sqrt(d₁/tr(K[3]'K[3]));
K[1] = real.(K[3]);
K[1] *= sqrt(d₁/tr(K[1]'K[1]));
K[2] = K[1]*im;

# The unitaries are identities with complex phases
# U[1] is purely real, U[2] is purely imaginary, U[3] is complex
U = Any[I(d₁*d₂) for _ ∈ 1:3];
U[2] = U[1]*im;
U[3] = U[1]*exp(im*π/4);

# Here you’ll find that Δarr[:,:,:,2] (JuMP) is consistent
# between the optimal and evaluated values, while there is
# a discrepancy in Δarr[:,:,:,1] (Convex.jl), but only at
# Δarr[3,3,:,1] when both U and K are complex
Δarr = zeros(3,3,2,2);
for j ∈ 1:3, k ∈ 1:3
Δarr[j,k,:,1] .= opt_Convex(U[j],K[k];d₁,d₂);
Δarr[j,k,:,2] .= opt_JuMP(U[j],K[k];d₁,d₂);
end

With the output

julia> Δarr
3×3×2×2 Array{Float64, 4}:
[:, :, 1, 1] =
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642
1.52194  1.52194  1.44032

[:, :, 2, 1] =
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642
1.52194  1.52194  1.44759

[:, :, 1, 2] =
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633

[:, :, 2, 2] =
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633

Each column should be consistent as they are optimised over the same K. The only discrepancy happens for Convex.jl ([:,:,1,1] and [:,:,2,1]) with both U and K complex.

Maybe it has to do with some routine that separates the real and imaginary parts when Convex.jl preprocesses the variables before sending it to the optimiser?

1 Like

You could open an issue at Issues · jump-dev/Convex.jl · GitHub to report this, but be aware that Convex.jl is currently in limited maintenance mode (GitHub - jump-dev/Convex.jl: A Julia package for disciplined convex programming), so it’s unlikely to get fixed unless you dig in and find the problem. (I’m happy to review a PR, but I’m a novice when it comes to the internals of Convex, so I don’t even know where you should start to look )

1 Like

Ah, that’s a pity, I’ve enjoyed using Convex very much. Guess I’d stick to using JuMP for now, and maybe dig a little deeper into Convex when I’m procrastinating on my own thesis . Thanks anyway!

Please do file an issue! As Oscar said, the package is indeed in need of a maintainer, but having issues with open bugs helps a lot (…sometimes I find time to contribute a bit still). This seems like a serious issue (a correctness bug) so it is good it is tracked.

In [23]: Δarr
Out[23]: 3×3×2×2 Array{Float64, 4}:
[:, :, 1, 1] =
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642

[:, :, 2, 1] =
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642
1.52194  1.52194  1.44642

[:, :, 1, 2] =
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633

[:, :, 2, 2] =
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633
1.52194  1.52194  1.44633

I’ll admit I haven’t read closely enough to understand if this is good or not .

Submitted a bug report with a simpler optimisation that still exhibits the behaviour

Oddly enough, I checked out the eph/moi2-again branch and got an error when trying to take a tensor product:

julia> kron(I(2),Semidefinite(2))
ERROR: MethodError: no method matching Matrix{Bool}(::Vector{Bool})
Closest candidates are:
Array{T, N}(::AbstractArray{S, N}) where {T, N, S} at array.jl:540
Matrix{T}(::LinearAlgebra.LQPackedQ) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lq.jl:154
Matrix{T}(::LinearAlgebra.HessenbergQ{var"#s814", var"#s8141", var"#s813", false} where {var"#s814", var"#s8141"<:StridedMatrix{var"#s814"}, var"#s813"<:StridedVector{var"#s814"}}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/hessenberg.jl:467
...
Stacktrace:
[1] convert(#unused#::Type{Matrix{Bool}}, a::Vector{Bool})
@ Base ./array.jl:532
[2] Convex.Constant(x::Vector{Bool}, sign::Positive)
@ Convex ~/Desktop/Convex.jl/src/constant.jl:41
[3] Convex.Constant(x::Vector{Bool}, check_sign::Bool) (repeats 2 times)
@ Convex ~/Desktop/Convex.jl/src/constant.jl:44
[4] constant(x::Bool)
@ Convex ~/Desktop/Convex.jl/src/constant.jl:99
[5] *(x::Bool, y::Variable)
@ Convex ~/Desktop/Convex.jl/src/atoms/affine/multiply_divide.jl:167
[6] kron(a::Diagonal{Bool, Vector{Bool}}, b::Variable)
@ Convex ~/Desktop/Convex.jl/src/atoms/affine/kron.jl:7
[7] top-level scope
@ none:1

While kron(Semidefinite(2),I(2)) worked. Even so, tr(Semidefinite(2)) == 1 gave a similar error, so I wasn’t able to confirm what you found.

2 Likes