Performance of optimization problem in a loop

Hello. I’m trying to improve the performance of an optimization problem. I have a Hermitian matrix and want to minimize the diagonal terms while keeping the matrix positive semidefinite.

Given a matrix S,

\text{minimize} \quad \textstyle\sum_i d_i \\
\text{subject to} \quad S + \operatorname{diag}(d) \succeq 0

where d is a vector variable. This is solved using Convex.jl and COSMO.jl:

using Convex, COSMO
using LinearAlgebra   # for diagm, Diagonal, diag, Hermitian

n = 4
x = rand(ComplexF64, n)
noise = diagm(rand(n))          # real, positive diagonal perturbation
S = x*x' + noise                # Hermitian: rank-one term plus diagonal noise

d = Convex.Variable(n)
p = minimize(sum(d), S + Diagonal(d) in :SDP)
solve!(p, () -> COSMO.Optimizer())

Sx = S .+ diagm(d.value[:])     # solution: S with the optimal diagonal shift

diag(Sx) + diag(noise) ≈ diag(S) # ≈ true: the optimal shift cancels the diagonal noise

First question, is this an optimal formulation or should it be reformulated?

Second question: my real problem is larger (n = 80) and needs to be solved thousands of times, since my matrix S is in fact 3D. Setting up a loop:

N, N, M = size(S) # (80, 80, 1000): M independent 80×80 slices
d = Convex.Variable(N)
A = similar(S)
for i = 1:M
    # build and solve a fresh problem for every slice
    p = minimize(sum(d), S[:,:,i] + Diagonal(d) in :SDP)
    solve!(p, () -> COSMO.Optimizer())
    A[:,:,i] = Hermitian(S[:,:,i] .+ diagm(d.value[:]))
end

How can I avoid setting up the problem at each iteration? Warm-starting seems like the perfect solution, but I’m not sure how to do that when my constraints change at each iteration.

Thanks!

How can I avoid setting up the problem at each iteration? Warm-starting seems like the perfect solution, but I’m not sure how to do that when my constraints change at each iteration.

Unfortunately, Convex.jl does not have a good answer to this right now.

One thing you can do is use fix!'d variables:

N, N, M = size(S) # (80,80,1000)
d = Convex.Variable(N)
A = similar(S)
B = Variable(size(S)[1:2])   # placeholder for the current slice (may need ComplexVariable if S is complex)
p = minimize(sum(d), B + Diagonal(d) in :SDP)

for i = 1:M
    fix!(B, S[:, :, i])      # swap in the i-th slice without rebuilding the problem
    solve!(p, () -> COSMO.Optimizer())
    A[:,:,i] = Hermitian(S[:,:,i] .+ diagm(d.value[:]))
end

(untested, but should work). However, I’m not sure it will actually save you much time, because on every solve! call Convex recalculates its extended formulations. (Let me know if it does help, though!) Ideally, these could be reused, but that isn’t implemented (see Cache conic forms between `solve!`s? · Issue #318 · jump-dev/Convex.jl · GitHub), and it might also need a lot of thought to implement correctly.

Could you instead parameterize the Cholesky factor directly? That way your constraint is just a box constraint on the diagonal of the factor (for uniqueness). Getting the diagonal of a matrix from its Cholesky factor requires a little bit of computation, but I wouldn’t guess that it would be a bottleneck.

EDIT: read the problem wrong, sorry. I don’t think this is a helpful suggestion.

Thanks for the clarification! Looking into the timings: it does seem to improve, but I need to check whether the two approaches actually find the same result. Will update…

Interesting suggestion. Could you elaborate a bit? I’m not quite sure how to do what you propose. Thanks!

Looks like a 2× speed-up. I do get a few Problem status INFEASIBLE results with the fix! approach, not sure why, but overall it works. Thanks!

Nice! Glad to hear there was a benefit. Convex’s code should treat a fix!'d variable exactly the same as if you had put the matrix there directly, so I’d suspect the INFEASIBLE status is unrelated, but there could be a bug or something else strange going on. If you can get a reproducible example where there is a difference (ideally on a small/simple problem), please file an issue so I can look into it in more detail (https://github.com/jump-dev/Convex.jl/issues/new).

Apologies! I read the problem too quickly and thought you were trying to find a matrix that minimized something. I’ve edited my above comment to say the same.

Inspired by this post (Near positive definite (NearPD) of a symmetric matrix in Julia - #2 by lostella), I was wondering whether ProximalAlgorithms.jl could enable warm-starting for this particular problem. @lostella, if you don’t mind me asking, could you suggest strategies to solve this?

Not to sidetrack this, but you can try warmstarting in Convex with solve!(problem, solver; warmstart=true) by the way.
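
For concreteness, a rough sketch of how that might slot into the fix! loop above (untested; warmstart=true reuses the variable values left over from the previous solve as the starting point, so I skip it on the first iteration). It reuses d, B, p, A, S and M from the earlier snippet:

for i = 1:M
    fix!(B, S[:, :, i])                 # swap in the next slice
    # reuse the previous d as the initial guess from the second solve onwards
    solve!(p, () -> COSMO.Optimizer(); warmstart = (i > 1))
    A[:, :, i] = Hermitian(S[:, :, i] .+ diagm(d.value[:]))
end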

Not sidetracking at all. I should have mentioned that I did try that, without any benefit (to my particular problem).

Hi, I am one of the developers of COSMO.jl. It seems like you could make the model creation very fast, because you are just changing one side of the PSD constraint. Internally, in COSMO it will look like this:

-\operatorname{diag}(d) + X = S, \quad X \succeq 0

and just S changes. This means you can recycle a lot of the steps of the algorithm, i.e., set up the problem once and only change that part of the constraint.
However, the dominating cost of the computation is repeatedly computing the eigendecomposition of X at each iteration. You won’t get around this, so warm-starting your initial guess of d could help to decrease the number of times you have to do that.

Furthermore, your S is dense, right? If S were sparse, COSMO could get some further speed-ups by decomposing the PSD constraint.

One issue is that at the moment each model change in Convex.jl or JuMP triggers a completely new model copy that is passed to COSMO. I haven’t gotten around to supporting incremental model changes yet.

If you are sure your problems are always feasible, you can lower the infeasibility check tolerance in the solver settings with eps_prim_inf = 1e-8 and eps_dual_inf = 1e-8.
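
For reference, a minimal sketch of how that could look with the solve! call style used above (assuming, as in the COSMO.jl examples, that COSMO.Optimizer accepts these settings as keyword arguments):

# Sketch: tighten COSMO's infeasibility detection tolerances so nearly-feasible
# problems are not declared INFEASIBLE prematurely (only do this if you are
# confident every slice really is feasible).
solve!(p, () -> COSMO.Optimizer(eps_prim_inf = 1e-8, eps_dual_inf = 1e-8, verbose = false))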

Thank you for the suggestions. Yes, my matrix is dense, and changing tolerances is indeed an easy way to speed things up. Will need to experiment more with this.

Seeing how COSMO treats the problem, it does look obvious that a lot can be reused. Is your suggestion not similar to the fix! in the Convex code above? Looking at the COSMO examples, e.g., https://github.com/oxfordcontrol/COSMO.jl/blob/master/examples/closest_correlation_matrix.jl, I see that the example uses JuMP. Do you think that switching to that, with a @constraint in the loop, could make a significant difference? Will need to test it out…

I think all the methods implemented in ProximalAlgorithms require the initial iterate to be provided; see for example here. Whether the (approximate) solution to each of your problems is a good starting point for the next one is hard to say, of course.

If I’m not wrong, the dual problem to yours should be solvable via proximal gradient or Douglas-Rachford splitting. With any of these, the costly operation will definitely be the SVD computation; you might want to give some of the quasi-Newton variants of these a try (see this and this).
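
For what it’s worth, a sketch of that dual (obtained by minimizing the Lagrangian \sum_i d_i - \langle Z, S + \operatorname{diag}(d) \rangle over d; I’m glossing over the complex Hermitian details):

\text{maximize} \quad -\langle S, Z \rangle \\
\text{subject to} \quad \operatorname{diag}(Z) = \mathbf{1}, \quad Z \succeq 0

The feasible set is the intersection of an affine set (unit diagonal) with the PSD cone, which is exactly the two-set structure these splitting methods alternate between, and projecting onto the PSD cone is where the eigendecomposition/SVD cost comes in.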

I’m sorry the package doesn’t yet have documentation, I would really need to duplicate myself to take care of that!

Edit: fixed link to quasi-Newton Douglas-Rachford.

I think it’s worth checking where you spend most of the time. If the eigenvalue decomposition of your matrix iterate takes >80% of the time, you probably won’t be able to achieve significant improvements by assembling the problem more efficiently.
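
A quick way to get a feel for that (just a sketch, reusing S and the Convex problem p from above; run each line twice so compilation time doesn’t distort the picture):

using LinearAlgebra

@time eigen(Hermitian(S[:, :, 1]));                      # one 80×80 eigendecomposition
@time solve!(p, () -> COSMO.Optimizer(verbose = false)); # one full SDP solve (setup + iterations)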

I don’t think switching to JuMP would make a difference, as both JuMP and Convex pass a completely new model to COSMO once you change the model. I am also not sure whether JuMP can handle complex variables.