Hi! I am trying to solve an optimization problem in which I estimate the L-matrix of a projection determinantal point process (DPP), which is known to be of the form U U^T where U is an M \times N matrix with orthonormal columns. This is done by maximizing the following log-likelihood:

\ell(U) = \sum_{\mathcal{I} \in Y} \log \det L_{\mathcal{I}},

where Y contains sampled order-N subsets of \{1, \dots, M\} and L_\mathcal{I} is L indexed at the indices \mathcal{I} (L[ℐ, ℐ] in Julia syntax). AFAIU, this likelihood is non-convex, and the problem is also overspecified since U is only determined up to its span.
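To make the last point concrete (this is just the standard determinant identity, writing U_{\mathcal{I},:} for the rows of U indexed by \mathcal{I}; nothing here is specific to my code): since each sample has exactly N elements, L_\mathcal{I} is the N \times N matrix U_{\mathcal{I},:} U_{\mathcal{I},:}^T, so

\det L_{\mathcal{I}} = \left(\det U_{\mathcal{I},:}\right)^2 = \det\left((U Q)_{\mathcal{I},:} (U Q)_{\mathcal{I},:}^T\right) \quad \text{for any orthogonal } N \times N \text{ matrix } Q,

and replacing U by U Q leaves every term of the log-likelihood unchanged.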
I tried using JuMP.jl at first, but did not find a way to implement constraints on matrices. Then I tried Nonconvex.jl, but none of the solvers I could find seemed to be able to handle the equality constraints needed to ensure U is orthonormal. I ended up using NLopt.jl directly with the ISRES solver as that seemed to be the only global optimizer able to handle equality constraints.
Here, I am generating artificial samples from a known U at first and then trying to fit another matrix U^\prime to those observations:
using Determinantal, LinearAlgebra, FHist
const N, M = 3, 10
U = qr(randn(M, M)).Q[:, 1:N]
dpp1 = ProjectionEnsemble(U)
# Draw `n` samples from `dpp` and bin them into an N-dimensional histogram over {1, …, M}^N
function sample_dpp(dpp, n, ::Val{N}) where {N}
    hist = Hist3D(; binedges = ntuple(_ -> 0.5:10.5, N))
    for _ in 1:n
        x = Determinantal.sample(dpp)
        atomic_push!(hist, ntuple(i -> x[i], N)...)
    end
    return hist
end
hist1 = sample_dpp(dpp1, 10^7, Val(N))
const n = bincounts(hist1)
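As an aside, here is a quick sanity check on the samples (just a sketch, not something in my original code; empirical is a throwaway name): for a projection DPP, the marginal probability that item i appears in a sample is (U U^T)_{ii}, so the empirical inclusion frequencies should match diag(U * U').

# Sanity-check sketch (not part of the fit): the empirical inclusion frequency of each item
# should match the diagonal of the kernel L = U * U' for a projection DPP.
empirical = [sum(c for (I, c) in pairs(IndexCartesian(), n) if i in Tuple(I)) / 10^7 for i in 1:M]
@show maximum(abs.(empirical .- diag(U * U')))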
using NLopt, DifferentiationInterface, Mooncake
# Log-likelihood of the binned samples: sum of count(I) * logdet(L[I, I]) over all observed subsets I
function objective(U)
    L = U * U'
    sum(pairs(IndexCartesian(), n)) do (I, nI)  # nI = count in bin I (avoids shadowing the global n)
        iszero(nI) && return nI
        i = [Tuple(I)...]
        x, sign = logabsdet(L[i, i])
        sign ≤ 0 ? -Inf : nI * x
    end
end
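(A quick numerical illustration of the over-specification mentioned above, not part of the optimization itself: the objective only sees span(U), so any orthogonal change of basis Q leaves it unchanged.)

# Illustration only: the objective is invariant under U -> U * Q for orthogonal Q,
# which is exactly why U is only determined up to its span.
Q = Matrix(qr(randn(N, N)).Q)
@show objective(U) ≈ objective(U * Q)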
const objective_prep = prepare_gradient(objective, AutoMooncake(; config = nothing), U)
# Orthonormality residuals: the upper triangle of U'U - I, flattened into a vector of length N*(N+1)/2
function constraint(x)
    U = reshape(x, M, N)
    [dot(U[:, i], U[:, j]) - (i == j) for (i, j) in Iterators.product(1:N, 1:N) if i ≤ j]
end
const constraint_prep = prepare_jacobian(constraint, AutoMooncake(; config = nothing), vec(U))
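(Again just a check, not part of the run: the true U is feasible by construction, so all N(N + 1)/2 residuals should be zero up to floating-point error.)

# Check (illustration only): the ground-truth U satisfies the orthonormality constraints.
@show maximum(abs.(constraint(vec(U))))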
opt = NLopt.Opt(:GN_ISRES, M * N)
NLopt.lower_bounds!(opt, -ones(M * N))
NLopt.upper_bounds!(opt, ones(M * N))
NLopt.xtol_rel!(opt, 1e-12)
NLopt.max_objective!(opt, function (x::Vector, grad::Vector)
    U = reshape(x, M, N)
    if length(grad) > 0  # never taken with GN_ISRES, which is derivative-free
        return value_and_gradient!(objective, reshape(grad, M, N), objective_prep, AutoMooncake(; config = nothing), U)[1]
    else
        return objective(U)
    end
end)
NLopt.equality_constraint!(opt, function (result::Vector{Float64}, x::Vector{Float64}, grad::Matrix{Float64})
    if length(grad) > 0
        r, jac = value_and_jacobian(constraint, constraint_prep, AutoMooncake(; config = nothing), x)
        grad .= jac'  # DifferentiationInterface returns an m×n Jacobian; NLopt wants its transpose
    else
        r = constraint(x)
    end
    result .= r
    return
end, fill(1e-12, N * (N + 1) ÷ 2))
min_f, min_x, ret = NLopt.optimize(opt, vec(qr(randn(M, M)).Q[:, 1:N]))
dpp2 = ProjectionEnsemble(reshape(min_x, M, N))
hist2 = sample_dpp(dpp2, 10^7, Val(N))
@show objective(U) min_f
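(One more diagnostic that might be worth adding, sketched here but not part of my runs above; U2 is just a throwaway name: since U is only identified up to its span, comparing the two projection kernels directly sidesteps the sampling noise of the histogram comparison.)

# Sketch (not in my runs above): compare the fitted and true kernels directly;
# the norm is zero exactly when the two column spans coincide.
U2 = reshape(min_x, M, N)
@show norm(U * U' - U2 * U2')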
This does work, but it is fairly slow and does not make use of any of the provided gradients. (My code also isn't super optimized yet; I could probably avoid some allocations, but I wanted to get a rough picture first.) Although the histograms produced by sampling the two solutions are quite similar, there are statistically significant deviations. Comparing min_f to the theoretically optimal objective(U), I get -3.3537589223237775e7 and -3.3128113721209113e7 respectively, which again is not bad, but I was hoping to get a little closer to the true optimum. I have also played around with setting ftol_rel!, but that did not change the results much.
As I only have limited experience with solving optimization problems, I wanted to ask for suggestions on algorithms or packages I could try that might be better suited for the task. I am also not really sure if this is the best way to formulate my problem, so if you have any other advice I am all ears.