Speeding up the construction of a second-order cone JuMP model

Hi everyone. I need to solve tens of thousands of second-order cone problems, so I need my code to be as fast as possible. Currently, my main bottleneck is actually building the problem in JuMP, and I would appreciate any help with speeding it up.

So suppose I have some list of matrices E = [ E_i ]_{i=1}^{10^n} for any given n and that, for some fixed \epsilon, I want to compute

\min_{\eta, X} \eta : \lVert [ \text{tr}(E_i X) ]_i \rVert_2 \leq \epsilon \eta ,

for some positive semidefinite X. What I mean by [ \text{tr}(E_i X) ]_i is the 10^n-dimensional vector obtained by multiplying each E_i with the matrix variable X and taking the trace, while \lVert \cdot \rVert_2 is the Frobenius norm.

Here is how I am currently implementing this:

using JuMP, SCS, LinearAlgebra

function f(ϵ, E)
    d = size(first(E), 1)  # side length of each matrix E_i
    model = Model(SCS.Optimizer)

    X = @variable(model, [1:d, 1:d] in PSDCone())
    @variable(model, η >= 0)

    # Construct the vector of traces tr(E_i * X):
    s = Vector{AffExpr}(undef, length(E))
    for (i, e) in enumerate(E)
        s[i] = tr(e * X)
    end
    # 2-norm of s is leq. ϵη:
    @constraint(model, [ϵ * η; s] in SecondOrderCone())

    @objective(model, Min, η)
    optimize!(model)
end

However, this implementation is too slow. Already for n \geq 5 it can take several minutes to build the model. Also, I get JuMP warnings about too many uses of the addition operator.

So how could I speed up this implementation? And why am I getting the JuMP warnings? Is it because tr does not operate in place? Can that also be improved?


Equivalently, one can write the problem as

\min_{\eta, X} \eta : \sum_i \text{tr}(E_i X)^2 \leq \epsilon^2 \eta ,

which comes from squaring the cone constraint \lVert [ \text{tr}(E_i X) ]_i \rVert_2 \leq \epsilon \eta and minimizing over \eta^2 instead of \eta; then \sqrt{\eta^{*}} is equal to the optimal value of the first formulation. A possible implementation is:

function g(ϵ, E)
    d = size(first(E), 1)
    model = Model(SCS.Optimizer)

    X = @variable(model, [1:d, 1:d] in PSDCone())
    @variable(model, η >= 0)

    # Accumulate s = Σ_i tr(E_i X)^2 as a quadratic expression:
    s = QuadExpr()
    for e in E
        tmp = tr(e * X)
        add_to_expression!(s, tmp, tmp)
    end
    @constraint(model, s <= η * ϵ^2)

    @objective(model, Min, η)
    optimize!(model)
end

This leads to a smaller problem for the solver, but in my implementation model building is even slower than in the first attempt.

So any hints on how to make this one faster are also welcome :slight_smile:


Note: Since the collection E of matrices can be very large, in my actual code it is implemented as a generator, and it has to stay that way. Therefore, any suggestion that depends on collecting E would not work.

I think in your case modifying your model in place instead of rebuilding it every time would yield substantial gains.
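
Something like the following minimal sketch (a toy model, only meant to show the relevant JuMP API):

using JuMP, SCS

model = Model(SCS.Optimizer)
@variable(model, x >= 0)
@constraint(model, c, 2.0 * x <= 1.0)
@objective(model, Min, x)
optimize!(model)

# Solve the next instance by mutating the model instead of rebuilding it:
set_normalized_coefficient(c, x, 3.0)  # update the coefficient of x in c
set_normalized_rhs(c, 0.5)             # update the right-hand side of c
optimize!(model)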


> Also, I get JuMP warnings about too many uses of the addition operator.

It’s hard to know without a reproducible example (what are ϵ and E?), but this should be better:

function f(ϵ, E)
    d = size(first(E), 1)
    model = Model(SCS.Optimizer)
    @variable(model, X[1:d, 1:d] in PSDCone())
    @variable(model, η >= 0)
    s = Vector{AffExpr}(undef, length(E))
    for (i, e) in enumerate(E)
        s[i] = @expression(model, LinearAlgebra.tr(e * X))
    end
    @constraint(model, [η; s] in SecondOrderCone())
    @objective(model, Min, η / ϵ)
    optimize!(model)
end
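
As for the warning: it typically appears when many JuMP expressions are summed with `+`, which copies the growing expression on every call, whereas `@expression` and `add_to_expression!` build it incrementally. A toy illustration (unrelated to the model above):

using JuMP

function sum_naive(x)
    # Each `+` allocates a fresh AffExpr, so this loop is O(n^2) and,
    # for enough terms, triggers the addition-operator warning:
    s = zero(AffExpr)
    for xi in x
        s = s + xi
    end
    return s
end

function sum_mutating(x)
    # add_to_expression! mutates the expression in place, so this is O(n):
    s = zero(AffExpr)
    for xi in x
        add_to_expression!(s, 1.0, xi)
    end
    return s
end

model = Model()
@variable(model, x[1:10_000])
sum_naive(x)     # slow
sum_mutating(x)  # fast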

Thanks! Here is a working example comparing the original implementations and the suggestion:

using JuMP
using SCS
using LinearAlgebra

function genE(n::Int)
    # Details are unimportant, assume given as black box.
    basis = [rand(2,2) for _ = 1:10]
    prods = Iterators.product(Iterators.repeated(1:10, n)...)
    (kron((basis[e] for e in el)...) for el in prods)
end

function f(ϵ::Float64, n::Int)
    E = genE(n)
    model = Model(SCS.Optimizer)
    set_silent(model)

    X = @variable(model, [1:2^n,1:2^n] in PSDCone())
    @variable(model, η >= 0)

    # Construct the vector of traces:
    s = Vector{AffExpr}(undef, 10^n)
    @time for (i,e) in enumerate(E)
        @inbounds s[i] = tr(e * X)
    end
    # Frobenius norm of s is leq. ϵη:
    @constraint(model, [ϵ * η; s] in SecondOrderCone())

    @objective(model, Min, η)
    @time optimize!(model)
end

function fnew(ϵ::Float64, n::Int)
    E = genE(n)
    model = Model(SCS.Optimizer)
    set_silent(model)

    @variable(model, X[1:2^n, 1:2^n] in PSDCone())
    @variable(model, η >= 0)
    s = Vector{AffExpr}(undef, 10^n)
    for (i, e) in enumerate(E)
        s[i] = @expression(model, LinearAlgebra.tr(e * X))
    end
    @constraint(model, [η; s] in SecondOrderCone())
    @objective(model, Min, η / ϵ)
    optimize!(model)
end

function g(ϵ::Float64, n::Int)
    E = genE(n)
    model = Model(SCS.Optimizer)
    set_silent(model)

    X = @variable(model, [1:2^n,1:2^n] in PSDCone())
    @variable(model, η >= 0)

    s = QuadExpr()
    for e in E
        tmp = tr(e * X)
        add_to_expression!(s, tmp, tmp)
    end
    @constraint(model, s <= η * ϵ^2)

    @objective(model, Min, η)
    optimize!(model)
end

Here are some running times; the new version of f seems to perform slightly better on the larger instances:

@btime f(2.0, 2)
@btime f(2.0, 3)
@btime f(2.0, 4)
  2.050 ms (20875 allocations: 2.05 MiB)
  70.313 ms (595480 allocations: 76.23 MiB)
  6.591 s (29902520 allocations: 6.94 GiB)

@btime fnew(2.0, 2)
@btime fnew(2.0, 3)
@btime fnew(2.0, 4)
  2.092 ms (21949 allocations: 2.22 MiB)
  69.706 ms (606794 allocations: 78.20 MiB)
  5.832 s (30043797 allocations: 7.01 GiB)

@btime g(2.0, 2)
@btime g(2.0, 3)
@btime g(2.0, 4)
  2.164 ms (20471 allocations: 1.85 MiB)
  95.829 ms (586395 allocations: 70.64 MiB)
  12.979 s (29773494 allocations: 6.73 GiB)

Here are the times for the main loop in each function and the total solving time:

@time f(2.0, 4)
  LOOP:     5.817728 seconds (29.76 M allocations: 6.718 GiB, 32.05% gc time)
  SOLVER:   1.081494 seconds (1.12 k allocations: 93.828 MiB, 26.02% gc time)
  TOTAL:    7.094979 seconds (29.90 M allocations: 6.941 GiB, 30.59% gc time)

@time fnew(2.0, 4)
  LOOP:     4.801211 seconds (29.90 M allocations: 6.788 GiB, 17.66% gc time)
  SOLVER:   1.059976 seconds (1.12 k allocations: 93.827 MiB, 25.12% gc time)
  TOTAL:    6.047453 seconds (30.04 M allocations: 7.011 GiB, 18.79% gc time)

@time g(2.0, 4)
  LOOP:     13.156862 seconds (29.77 M allocations: 6.721 GiB, 6.01% gc time)
  SOLVER:   0.007727 seconds (1.22 k allocations: 6.449 MiB)
  TOTAL:    13.167753 seconds (29.77 M allocations: 6.730 GiB, 6.00% gc time)

So it seems g is easy for the solver but takes a long time to build, while f is the opposite. I would be glad to get any pointers on how to further improve any of these implementations.

Thanks, I had not considered that before. For the actual model (which is quite a bit more complicated), it turns out to be tricky to do this, but it might help once I get it right.

The best I found was:

using JuMP
import SCS
import LinearAlgebra

function genE(n::Int)
    basis = [rand(2,2) for _ = 1:10]
    prods = Iterators.product(Iterators.repeated(1:10, n)...)
    return (LinearAlgebra.kron((basis[e] for e in el)...) for el in prods)
end

function f(ϵ::Float64, n::Int)
    E = genE(n)
    model = Model(SCS.Optimizer)
    set_silent(model)
    @variable(model, X[1:2^n, 1:2^n], PSD)
    @variable(model, η >= 0)
    args = Vector{AffExpr}(undef, 1+10^n)
    args[1] = η
    for (i, e) in enumerate(E)
        # args[i+1] = @expression(model, LinearAlgebra.tr(e * X))
        args[i+1] = @expression(model, sum(e[j, :]' * X[:, j] for j in 1:2^n))
    end
    @constraint(model, args in SecondOrderCone())
    @objective(model, Min, η / ϵ)
    optimize!(model)
    return objective_value(model)
end

@time f(2.0, 4)  # ~2.3 seconds

You want to avoid actually constructing the 10,000 matrices e * X only to use their diagonal elements.
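
The trick relies on the identity tr(A B) = \sum_j A[j, :] \cdot B[:, j], which needs only the diagonal of the product. A quick numeric sanity check with made-up matrices:

using LinearAlgebra

A, B = rand(4, 4), rand(4, 4)

t1 = tr(A * B)                                         # forms the full 4×4 product
t2 = sum(dot(A[j, :], B[:, j]) for j in 1:size(A, 1))  # only the diagonal terms
t1 ≈ t2                                                # true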


Nice, I found it to be several times faster in the larger cases I was interested in. Thanks!
