I am trying to formulate the matrix completion problem (in its unconstrained form) in JuMP.
I am using a formulation that minimizes the squared loss plus a Frobenius-norm penalty, but it always returns the zero matrix as the solution, because the gradient at that point is zero no matter which optimization algorithm I run (Ipopt, for example).
My code is as follows (the matrix Ω contains only 1s and 0s to indicate which entries of M are known):
function formulateMatrixCompletionProblem(M::Matrix{Float64}, Ω::Matrix{Int64}, r::Int64, λ::Float64)
    model = Model()
    m = size(M, 1)
    n = size(M, 2)
    @variable(model, X[i=1:m, j=1:r])
    @variable(model, Y[i=1:n, j=1:r])
    # 1/2 ∑ Ω_{ij} [M_{ij} - (XY')_{ij}]^2 + λ * (||X||_F^2 + ||Y||_F^2)
    @NLexpression(model, square_loss,
        0.5 * sum(sum(Ω[i, j] * (M[i, j] - sum(X[i, k] * Y[j, k] for k in 1:r))^2 for j in 1:n) for i in 1:m))
    @NLexpression(model, frobeniusNorm_X, sum(sum(X[i, j]^2 for j in 1:r) for i in 1:m))
    @NLexpression(model, frobeniusNorm_Y, sum(sum(Y[i, j]^2 for j in 1:r) for i in 1:n))
    @NLobjective(model, Min, square_loss + λ * (frobeniusNorm_X + frobeniusNorm_Y))
    return model
end
Please note: this issue occurs when I use the zero matrix as my starting point; if I instead initialize the variables from a truncated singular value decomposition of M (based only on the known entries), I don't run into it. I want to double-check whether there are other approaches to follow.
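For reference, the warm start I mentioned looks roughly like this (a sketch of my own setup; the masked SVD and the set_start_value calls are just one way to seed the solver, not part of the formulation above):

using JuMP, Ipopt, LinearAlgebra
model = formulateMatrixCompletionProblem(M, Ω, r, λ)
set_optimizer(model, Ipopt.Optimizer)
# Rank-r truncated SVD of the masked matrix (unknown entries treated as 0).
U, S, V = svd(M .* Ω)
X0 = U[:, 1:r] * Diagonal(sqrt.(S[1:r]))
Y0 = V[:, 1:r] * Diagonal(sqrt.(S[1:r]))
# Use the SVD factors as the starting point instead of the all-zero default.
set_start_value.(model[:X], X0)
set_start_value.(model[:Y], Y0)
optimize!(model)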
To clarify: the problem is that, when you start from the zero point, Ipopt stops at the zero matrix? This is expected.
Ipopt assumes the problem is smooth and twice-differentiable, and it only guarantees locally optimal points. You have bilinear terms X[i, k] * Y[j, k], so the problem is not convex.
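To spell it out (a short derivation of my own from the objective in your post): the partial derivative of the objective with respect to X_{ik} is

∂f/∂X_{ik} = -∑_j Ω_{ij} (M_{ij} - ∑_l X_{il} Y_{jl}) Y_{jk} + 2 λ X_{ik}

and symmetrically for Y_{jk}. Every term contains a factor of X or Y, so the gradient is exactly zero at X = Y = 0; that point is a stationary point, and a gradient-based method started there has no reason to move, which matches what you observed.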
Have you tried removing square_loss from the objective and instead adding it as a set of constraints with slack variables? This should give a somewhat equivalent formulation:
using JuMP, Ipopt
M = rand(3, 4)                        # demo data
Ω = map(x -> x > 0.6 ? 1 : 0, M)      # treat entries greater than 0.6 as "known"
r = 2
λ = 0.1
model = Model(Ipopt.Optimizer)
m, n = size(M)
F = [(i, j) for i in 1:m for j in 1:n if Ω[i, j] > 0]  # indices of the known entries of M
@variable(model, X[i=1:m, j=1:r])
@variable(model, Y[i=1:n, j=1:r])
@variable(model, s[i=1:m, j=1:n])  # slack variables; only those with (i, j) in F appear in the model
@NLexpression(model, frobeniusNorm_X, sum(sum(X[i, j]^2 for j in 1:r) for i in 1:m))
@NLexpression(model, frobeniusNorm_Y, sum(sum(Y[i, j]^2 for j in 1:r) for i in 1:n))
@NLconstraint(model,
    con_known_entries_bounds[(i, j) in F],
    sum(X[i, k] * Y[j, k] for k in 1:r) == M[i, j] + s[i, j]
)
@NLobjective(model, Min, sum(s[i, j]^2 for (i, j) in F) + λ * (frobeniusNorm_X + frobeniusNorm_Y))
optimize!(model)
If you look at value.(X), value.(Y), and value.(s), you can (I hope) see why the optimizer can still end up choosing the zero matrix for X and Y.
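As a quick sanity check (my own addition, not part of the formulation): with X = 0, Y = 0 and s[i, j] = -M[i, j] for (i, j) in F, the constraints are satisfied and the regularization term is zero, so the objective reduces to the sum of the squared known entries. Comparing that with what Ipopt returns shows whether the zero factorization is actually the better local answer for your λ:

# Objective value of the all-zero factorization (feasible with s[i, j] = -M[i, j]):
obj_zero = sum(M[i, j]^2 for (i, j) in F)
# Objective value Ipopt actually found:
obj_found = objective_value(model)
@show obj_zero obj_found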