Fitting a projection DPP by maximizing a sum of log determinants

Hi! I am trying to solve the following optimization problem: I want to estimate the L-matrix of a projection determinantal point process (DPP), which is known to be of the form L = U U^T, where U is an M \times N matrix with orthonormal columns. This is done by maximizing the following log-likelihood:

f(L) = \sum_{\mathcal I \in Y} \operatorname{logdet} L_\mathcal{I},

where Y contains sampled size-N subsets of \{1, \dots, M\} and L_\mathcal{I} is the submatrix of L indexed by \mathcal I (L[ℐ, ℐ] in Julia syntax). AFAIU, this likelihood is non-convex, and the problem is also overparameterized, since U is only determined up to its column span.
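
For instance, this quick check (a sketch; the variable names are mine) confirms that right-multiplying U by an orthogonal matrix leaves L, and hence f, unchanged:

# Sketch: L = U * U' only depends on the column span of U,
# so f(U * Q) = f(U) for any orthogonal N × N matrix Q.
using LinearAlgebra
M, N = 10, 3
U = qr(randn(M, M)).Q[:, 1:N]
Q = qr(randn(N, N)).Q  # arbitrary orthogonal N × N matrix
@assert U * U' ≈ (U * Q) * (U * Q)'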

I tried using JuMP.jl at first, but did not find a way to implement the matrix constraints I needed. Then I tried Nonconvex.jl, but none of the solvers I could find seemed able to handle the equality constraints needed to ensure U has orthonormal columns. I ended up using NLopt.jl directly with the ISRES solver, as that seemed to be the only global optimizer able to handle equality constraints.

Here, I first generate artificial samples from a known U and then try to fit another matrix U^\prime to those observations:

using Determinantal, LinearAlgebra, FHist

const N, M = 3, 10
U = qr(randn(M, M)).Q[:, 1:N]  # random M × N matrix with orthonormal columns
dpp1 = ProjectionEnsemble(U)

# Draw n samples from the DPP and tally them in an N-dimensional histogram
# (bin edges are hard-coded for M = 10).
function sample_dpp(dpp, n, ::Val{N}) where {N}
    hist = Hist3D(; binedges = ntuple(_ -> 0.5:10.5, N))
    for _ in 1:n
        x = Determinantal.sample(dpp)
        atomic_push!(hist, ntuple(i -> x[i], N)...)
    end
    return hist
end
hist1 = sample_dpp(dpp1, 10^7, Val(N))
const n = bincounts(hist1)  # counts per sampled subset


using NLopt, DifferentiationInterface, Mooncake

# Log-likelihood: sum over all observed subsets I of count * logdet(L[I, I]).
function objective(U)
    L = U * U'
    sum(pairs(IndexCartesian(), n)) do (I, c)
        iszero(c) && return 0.0
        i = collect(Tuple(I))
        x, s = logabsdet(L[i, i])
        s ≤ 0 ? -Inf : c * x  # guard against numerically non-positive determinants
    end
end
# One-time preparation of the AD backend for DifferentiationInterface
const objective_prep = prepare_gradient(objective, AutoMooncake(; config = nothing), U)
# Orthonormality residuals U^T U - I, upper triangle only (N(N+1)/2 entries)
function constraint(x)
    U = reshape(x, M, N)
    [dot(U[:, i], U[:, j]) - (i == j) for (i, j) in Iterators.product(1:N, 1:N) if i ≤ j]
end
const constraint_prep = prepare_jacobian(constraint, AutoMooncake(; config = nothing), vec(U))

opt = NLopt.Opt(:GN_ISRES, M * N)
NLopt.lower_bounds!(opt, -ones(M * N))  # entries of an orthonormal U lie in [-1, 1]
NLopt.upper_bounds!(opt, ones(M * N))
NLopt.xtol_rel!(opt, 1e-12)
NLopt.max_objective!(opt, function (x::Vector, grad::Vector)
    U = reshape(x, M, N)
    if length(grad) > 0
        # write the gradient in place; reshape(grad, M, N) shares memory with grad
        return value_and_gradient!(objective, reshape(grad, M, N), objective_prep, AutoMooncake(; config = nothing), U)[1]
    else
        return objective(U)
    end
end)
NLopt.equality_constraint!(opt, function (result::Vector{Float64}, x::Vector{Float64}, grad::Matrix{Float64})
    if length(grad) > 0
        r, jac = value_and_jacobian(constraint, constraint_prep, AutoMooncake(; config = nothing), x)
        grad .= jac'  # NLopt expects grad[j, i] = ∂cᵢ/∂xⱼ
    else
        r = constraint(x)
    end
    result .= r
    return
end, fill(1e-12, N * (N + 1) ÷ 2))
min_f, min_x, ret = NLopt.optimize(opt, vec(qr(randn(M, M)).Q[:, 1:N]))  # random orthonormal start

dpp2 = ProjectionEnsemble(reshape(min_x, M, N))
hist2 = sample_dpp(dpp2, 10^7, Val(N))

@show objective(U) min_f

This does work, but it is fairly slow and does not make use of any of the provided gradients (ISRES is derivative-free, as the GN_ prefix indicates). My code also isn’t super optimized yet; I could probably avoid some allocations, but I wanted to get a rough picture first. Although the histograms produced by sampling the two solutions are quite similar, there are statistically significant deviations. Comparing min_f to the theoretically optimal objective(U), I get -3.3537589223237775e7 and -3.3128113721209113e7 respectively, which again is not bad, but I was hoping to get a little closer to the true optimum. I have also played around with setting ftol_rel!, but that did not change the results much.

As I only have limited experience with solving optimization problems, I wanted to ask for suggestions on algorithms or packages that might be better suited for this task. I am also not really sure this is the best way to formulate my problem, so if you have any other advice, I am all ears.

What are the constraints you have on U? Or is the only thing that matters that L is a rank-N projector?

It’s easy to relax this problem to maximizing over positive semidefinite matrices L of trace N; the problem is then convex, and you can even make it conic by using a solver that supports the logdet cone.
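
As an illustration, a rough (untested) sketch of that relaxation in JuMP with Hypatia, reusing the count array n and the dimensions from your post, could look like this:

# Untested sketch: maximize Σ c_I · logdet(L[I, I]) over PSD L with tr(L) = N,
# using Hypatia's support for the log-det cone via MOI.LogDetConeTriangle.
using JuMP, Hypatia, LinearAlgebra

function fit_relaxed(n, M, N)
    obs = [(collect(Tuple(I)), c) for (I, c) in pairs(IndexCartesian(), n) if !iszero(c)]
    model = Model(Hypatia.Optimizer)
    @variable(model, L[1:M, 1:M], PSD)
    @constraint(model, tr(L) == N)      # a rank-N projector has trace N
    @variable(model, t[1:length(obs)])  # epigraph variables: t[k] ≤ logdet(L[I, I])
    for (k, (I, _)) in enumerate(obs)
        d = length(I)
        # column-wise upper triangle of the submatrix L[I, I], as MOI expects
        tri = [L[I[a], I[b]] for b in 1:d for a in 1:b]
        @constraint(model, [t[k]; 1; tri] in MOI.LogDetConeTriangle(d))
    end
    @objective(model, Max, sum(c * t[k] for (k, (_, c)) in enumerate(obs)))
    optimize!(model)
    return value.(L)
end

At the optimum each t[k] is tight against logdet(L[I, I]), so the objective matches the relaxed likelihood.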

Now enforcing L to be a projector is quite hard. You can reformulate the problem as an infinite hierarchy using the techniques in [2012.00554] Quantum-Inspired Hierarchy for Rank-Constrained Optimization.

Thanks for the quick response! Yes, the only constraint I want to enforce is that L is a rank-N projector. The paper you linked looks interesting but is probably a little too deep for me. Do you know if anything like this has already been implemented in Julia?

I’m not aware of any publicly available implementation, sorry. The difficult part of it is the DPS hierarchy, though, which I have implemented in my own package Ket.jl.

Thanks for the link! I might eventually look into that, but I don’t think I can justify implementing my own specialized optimizer at the moment. Perhaps there is some lower-hanging fruit for my constrained non-convex approach, though? I think I will focus on eliminating allocations first and then explore further solver options. Are there general-purpose solvers better suited to tasks like this?
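
Concretely, the kind of allocation trimming I have in mind looks roughly like this (an untested sketch reusing the definitions from my first post; the in-place factorization is probably not Mooncake-friendly, so this would only serve the derivative-free path):

# Untested sketch of an allocation-light objective: reuse one N × N buffer
# for each submatrix and factorize it in place instead of slicing L.
using LinearAlgebra

function objective_buffered(U, n, ::Val{N}) where {N}
    L = U * U'
    buf = Matrix{eltype(L)}(undef, N, N)
    acc = 0.0
    for (I, c) in pairs(IndexCartesian(), n)
        iszero(c) && continue
        idx = Tuple(I)
        @inbounds for b in 1:N, a in 1:N
            buf[a, b] = L[idx[a], idx[b]]  # copy L[I, I] into the buffer
        end
        x, s = logabsdet(lu!(buf; check = false))  # in-place LU, no extra copies
        acc += s ≤ 0 ? -Inf : c * x
    end
    return acc
end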

You wouldn’t be implementing an optimizer, you’d be using one that supports the logdet cone (like Hypatia). But yes, it is a lot of work that I would also avoid if I could.

In general, rank-constrained optimization is a well-studied problem, and plenty of heuristics for it exist. Perhaps one of the ones discussed here will be useful to you.
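
One common example (a sketch, not specific to that discussion): solve the convex relaxation above, then round the result back to a feasible point by keeping the top-N eigenvectors, which can also serve as a warm start for a local constrained solver.

# Sketch: project a relaxed solution L onto the rank-N projectors by keeping
# its top-N eigenspace; the returned U has orthonormal columns, so U * U' is
# a valid projection DPP kernel.
using LinearAlgebra

function round_to_projector(L, N)
    F = eigen(Symmetric(L))          # eigenvalues are returned in ascending order
    return F.vectors[:, end-N+1:end] # eigenvectors of the N largest eigenvalues
end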
