Performance of primal versus dual forms in JuMP

I’m trying to determine the most efficient form of modelling a problem for a package (QuantumNPA.jl). The natural way of writing the problem is in the dual or geometric form (see here for nomenclature); however, this gives terrible performance with Mosek. I can also write the problem in primal or standard form, which Mosek solves nicely, but then setting up the problem takes a long time. I’m flummoxed. I have a MATLAB package to solve the same problem with YALMIP (Moment), and there it’s easy: I write the problem in dual form, and YALMIP gives it to Mosek in primal form. Both setting up the problem and solving it are as fast as it gets.
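To be concrete, in the notation of the example below the two forms are

maximize c⋅x subject to Σᵢ xᵢ Fᵢ ⪰ 0, x₁ = 1 (dual / geometric form)
minimize c₁ + ⟨F₁, Z⟩ subject to ⟨Fᵢ, Z⟩ = −cᵢ for i ≥ 2, Z ⪰ 0 (primal / standard form)

where the Fᵢ are fixed symmetric matrices with F₁ = I, and the second problem is the Lagrangian dual of the first.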

I’d like to achieve the same result with JuMP: ideally I’d write the problem in dual form, but if that doesn’t work it’s OK, I can also write it in primal form, as long as setting up the problem is fast. I’ve written a simple example to illustrate the problem (only conceptually; the example is too small for meaningful benchmarking):

using SparseArrays
using JuMP
using LinearAlgebra
using MosekTools

function build_basis()
    # Sparse symmetrized elementary matrices E_ij + E_ji for i < j,
    # with the identity prepended as F[1].
    d = 5
    F = [sparse([i,j],[j,i],[1.0,1.0],d,d) for j=2:d for i=1:j-1]
    pushfirst!(F,I(d))
    return F
end

function npa_dual()
    F = build_basis()
    d = size(F[1],1)
    nvars = length(F)
    # Each column of bigF is a vectorized basis matrix, so that
    # bigF * x == vec(sum(x[i] * F[i] for i in 1:nvars)).
    bigF = spzeros(d^2, nvars)
    for i=1:nvars
        bigF[:,i] .= vec(F[i])
    end
    coeffs = zeros(nvars)
    coeffs[[6,7,9,10]] .= [1,1,1,-1]

    model = Model(Mosek.Optimizer)
    @variable(model, x[1:nvars])
    @constraint(model, x[1] == 1)
    
    moment_matrix = Symmetric(reshape(bigF*x,d,d))
    @constraint(model, moment_matrix in PSDCone())

    @objective(model, Max, dot(coeffs,x))

    optimize!(model)
    return objective_value(model)
end

function npa_primal()
    F = build_basis()
    d = size(F[1],1)
    nvars = length(F)
    coeffs = zeros(nvars)
    coeffs[[6,7,9,10]] .= [1,1,1,-1]

    model = Model(Mosek.Optimizer)
    @variable(model, Z[1:d,1:d] in PSDCone())

    # Equality constraints ⟨F[i], Z⟩ = -coeffs[i], one per basis matrix.
    for i=2:nvars
        @constraint(model, dot(F[i],Z) + coeffs[i] == 0)
    end

    @objective(model, Min, coeffs[1] + dot(F[1],Z))
    optimize!(model)
    return objective_value(model)
end
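Both formulations should return the same optimal value: setting x to the first unit vector makes the moment matrix the identity, which is strictly feasible, so there is no duality gap. A quick sanity check, up to solver tolerance:

# The optimal values of the two formulations agree (no duality gap here).
@assert isapprox(npa_dual(), npa_primal(); rtol = 1e-6)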

In the primal formulation I can see that pretty much the entire time is spent in the line

@constraint(model, dot(F[i],Z) + coeffs[i] == 0)

Is there a smarter way of doing this?

We don’t have any special method for LinearAlgebra.dot(::SparseMatrixCSC, ::Symmetric{<:MA.AbstractMutable}), so it falls back to the generic code in SparseArrays. That code doesn’t exploit the mutability of JuMP affine expressions, so it allocates a lot of intermediate expressions and is hence very slow.
Try doing the dot product by hand, e.g. with

acc = zero(AffExpr)
add_to_expression!(acc, coeffs[i])
s = F[i]
for j in 1:size(s, 2)
    for k in nzrange(s, j)
        add_to_expression!(acc, nonzeros(s)[k], Z[rowvals(s)[k], j])
    end
end
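For instance, packaged as a small helper (the name is mine), the constraint loop in npa_primal becomes:

# Build the affine expression ⟨S, Z⟩ + c in place, without allocating
# intermediate expressions.
function sparse_dot_plus(S::SparseMatrixCSC, Z, c)
    acc = zero(AffExpr)
    add_to_expression!(acc, c)
    for j in 1:size(S, 2)
        for k in nzrange(S, j)
            add_to_expression!(acc, nonzeros(S)[k], Z[rowvals(S)[k], j])
        end
    end
    return acc
end

for i = 2:nvars
    @constraint(model, sparse_dot_plus(F[i], Z, coeffs[i]) == 0)
end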

Amazing, that was it! Thanks a lot. Now the time to set up the problem is barely measurable, as it should be, and solving the problem is the relevant part. For the record, the dot I implemented is

import MutableArithmetics as MA

# Extend LinearAlgebra.dot so the generic constraint-building code picks it up.
function LinearAlgebra.dot(A::SparseMatrixCSC, B::Symmetric{<:MA.AbstractMutable})
    acc = zero(eltype(B))
    for j in 1:size(A, 2)
        for k in nzrange(A, j)
            add_to_expression!(acc, nonzeros(A)[k], B[rowvals(A)[k], j])
        end
    end
    return acc
end

I’m a bit confused, though, because I had thought about this, and did check where dot(F[1], Z) was going with the @edit macro. It got me inside MutableArithmetics, so I thought: great, it’s using the proper code.
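For reference, the dispatched method can also be printed without opening an editor, with @which:

using InteractiveUtils  # loaded by default in the REPL

# Print the method that the call dispatches to.
@which dot(F[1], Z)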

Another thing: is there any chance of getting Mosek to solve the primal problem when writing down the dual, i.e., to get YALMIP’s behaviour? I mean without using Dualization.jl, as that’s slow.

The recommendation is to use Dualization; it shouldn’t be a bottleneck. If it is, that’s a performance bug that should be fixed. If you have a reproducible example, we can investigate.
Actually, for SumOfSquares we implement both the image form and the kernel form, since they are not really dual to each other. These are implemented as two bridges, and which form is applied depends on which solver you use and on whether you use Dualization; see LMI bridge · Issue #205 · jump-dev/SumOfSquares.jl · GitHub.
We rewrote SumOfSquares to make this part not specific to polynomials, so that QuantumNPA.jl could create a SumOfSquares.WeightedSOSCone and then use the image-form and kernel-form bridges. It’s still a WIP, but I’m going to present it at JuMP-dev in two weeks.
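With Dualization the pattern is to keep writing the model in dual form and let Mosek receive the dualized problem, e.g.:

using Dualization

# Mosek receives the dual of the model as written.
model = Model(dual_optimizer(Mosek.Optimizer))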


I just tested it carefully, and indeed Dualization worked fine; it didn’t introduce a measurable overhead, and the resulting SDP that Mosek got was the same. I don’t know why I was under the impression it was slow.

I’m interested in watching your talk (I’m quite attached to the monomial basis). Is it going to be streamed?

It won’t be streamed, but the talks will be recorded, and we’ll upload them to YouTube some time after the workshop.

Thanks for the information, I’ll watch it then.
