Hi JuMP developers,
I am translating a MATLAB/YALMIP semidefinite program to Julia/JuMP + MosekTools. The model comes from a many-body bootstrap calculation. The MATLAB/YALMIP version can solve a small 3 x 3 lattice case, but my direct JuMP translation runs into severe memory issues when solving the full SDP.
I would appreciate advice on whether my JuMP formulation is inefficient, whether I should use MathOptInterface directly, or whether this is an expected solver-side factorization issue.
Disclosure: I prepared this post with AI assistance to make the description precise and reproducible, but the code, measurements, and problem setup are from my local experiments.
Problem summary
The model has:
- complex-valued decision variables,
- sparse affine maps from those variables into large Hermitian PSD constraints,
- two large T2 PSD blocks in the
3 x 3case:- one
81 x 81Hermitian PSD block, - one
162 x 162Hermitian PSD block.
- one
The JuMP model can be constructed successfully. The problem occurs during optimize!, apparently in the MOSEK/MOI bridge/solver stage.
Current modeling approach
I use complex JuMP variables:
using JuMP
import MathOptInterface as MOI
import Dualization
import Mosek
import MosekTools
model = use_dual ?
Model(Dualization.dual_optimizer(Mosek.Optimizer)) :
Model(Mosek.Optimizer)
@variable(model, M_D[1:Nk, 1:Nk, 1:Nk, 1:3] in ComplexPlane())
@variable(model, rho_k_sz[1:Nk, 1:2])
The difficult part is a sparse affine map into the T2 constraints:
D_vec = vec(M_D)
rho_var_vec = vec(rho_k_sz)
T2_vec = affine_map(A, D_vec) .+
affine_map(B, rho_var_vec) .+
extra
M_T2 = reshape(T2_vec, Nk^2, Nk^2, 1, 5)
For the 3 x 3 case, Nk = 9, so:
A size = (32805, 2187), nnz = 7481
B size = (32805, 18), nnz = 986
length(T2_vec) = 32805
The two T2 PSD constraints are:
# block 1: 81 x 81 Hermitian PSD
M_T2[:, :, 1, 1] + M_T2[:, :, 1, 1]'
# block 2: 162 x 162 Hermitian PSD
[
M_T2[:, :, 1, 2] M_T2[:, :, 1, 3]
M_T2[:, :, 1, 4] M_T2[:, :, 1, 5]
]
Initially I used:
@constraint(model, LinearAlgebra.Hermitian(H) in HermitianPSDCone())
Then I rewrote this to avoid constructing the full Hermitian matrix and instead pass the triangular vector directly to MathOptInterface:
@constraint(
model,
terms in MOI.HermitianPositiveSemidefiniteConeTriangle(n),
)
where terms is ordered as required by MOI:
- real parts of the upper triangular entries, column by column;
- imaginary parts of the strict upper triangular entries, column by column.
This avoids an extra full matrix allocation, but the solver-side problem remains large.
What I tried
1. Direct primal formulation
With:
model = Model(Mosek.Optimizer)
the 3 x 3 model builds, but the process is killed by the Linux OOM killer during optimize!.
A build-only diagnostic succeeds:
model, t = bootstrapSC(
3, 3, 2, [0.01, 0.0];
use_T2 = true,
build_only = true,
)
Result:
vars = 4392
constraints = 1290
So the model construction itself completes.
2. Dualization.jl
Using:
model = Model(Dualization.dual_optimizer(Mosek.Optimizer))
improves the situation. The same 3 x 3 model reaches MOSEK and can complete one interior-point iteration if I set:
set_attribute(model, "MSK_IPAR_INTPNT_MAX_ITERATIONS", 1)
The MOSEK log for the dualized model is:
Problem
Objective sense : maximize
Type : CONIC
Constraints : 46998
Scalar variables : 2377
Matrix variables : 83 (scalarized: 84159)
Optimizer
Constraints : 46908
Scalar variables : 2288
Semi-definite variables: 83
scalarized : 84159
Factor
setup time : 61.26
dense det. time : 20.98
nonzeros before factor : 4.35e+08
flops : 7.75e+12
This confirms that dual_optimizer helps avoid immediate OOM, but the factorization is still very large.
Comparison with MATLAB/YALMIP
In MATLAB/YALMIP, the corresponding code is essentially:
M_T2 = reshape( ...
A * reshape(M_D, [], 1) + ...
B * reshape(rho, [], 1) + ...
extra, ...
[Nk^2, Nk^2, 1, num_Sz_T2] ...
);
Constraints = [
Constraints,
M_T2(:,:,1,1) + M_T2(:,:,1,1)' >= 0,
[
M_T2(:,:,1,2), M_T2(:,:,1,3);
M_T2(:,:,1,4), M_T2(:,:,1,5)
] >= 0
];
Here A and B are sparse matrices. YALMIP appears to handle the sparse symbolic affine maps more efficiently for this small case.
Questions
-
Is
Dualization.dual_optimizer(Mosek.Optimizer)the recommended approach for this type of SDP? -
Is there a more memory-efficient way in JuMP to model sparse affine maps into Hermitian PSD cones?
-
Should I bypass JuMP expressions and build an
MOI.VectorAffineFunctionmanually for the PSD cone constraints? -
Would manually reformulating the Hermitian PSD constraints as real symmetric PSD constraints reduce memory or factorization size, or does MOI already do this optimally?
-
Are there best practices for avoiding huge factorization sizes in structured SDP models with a small number of large PSD blocks?
-
Is this likely a JuMP/MOI modeling issue, a bridge issue, or simply the expected cost of this SDP formulation as passed to MOSEK?
I am happy to provide the full reproducible code if useful. Any advice on the preferred JuMP/MOI formulation would be greatly appreciated.