Nuclear norm minimization objective using JuMP

I am new to JuMP and trying to solve the nuclear norm minimization problem for low-rank matrix completion:

$$
\min_{\mathbf{X}} \; \|\mathbf{X}\|_* \quad \text{subject to} \quad x_{ij} = m_{ij}, \quad (i, j) \in \Omega,
$$

where \Omega is the set of observed entries.

I am having trouble writing and registering the objective function in a way compatible with JuMP, and I haven’t been able to find a similar example in the tutorials or documentation. (I am using Julia 1.10.3 with JuMP v1.23.5.)

So far I have tried the following (and variations around it), but they all return a similar error:

# Create example data for the problem
using LinearAlgebra
using Random
rng = MersenneTwister(1234)
n = 20
r = 2
Y = randn(rng,n,r) * randn(rng,r,n)
X = Y .* (rand(rng, n, n) .< 1/2)
mask = X .!= 0

# Set-up the optimization problem
using JuMP, ProxSDP
model = Model(ProxSDP.Optimizer)
@variable(model, X[1:n, 1:n])
@constraint(model, X[mask] .== B[mask])

# 1st attempt:
nuclearnorm(A) = sum(LinearAlgebra.svdvals(A))
@operator(model, op_nuclearnorm, 1, nuclearnorm)
@objective(model, Min, nuclearnorm(X))
# ERROR: MethodError: no method matching svdvals!(::Matrix{NonlinearExpr})

# 2nd attempt
nuclearnorm(A) = sum(LinearAlgebra.diag(sqrt(A' * A)))
@operator(model, op_nuclearnorm, 1, nuclearnorm)
@objective(model, Min, nuclearnorm(X))
# ERROR: MethodError: no method matching sqrt(::Matrix{QuadExpr})

# 3rd attempt
f(x...) = sum(LinearAlgebra.svdvals(reshape(collect(x), n, n)))
@operator(model, myf, n*n, f)
@objective(model, Min, f(X...))
# ERROR: MethodError: no method matching svdvals!(::Matrix{NonlinearExpr})

I am guessing I need to implement a sparse version of the objective, but I am not sure how.

I have read in answers to other threads that JuMP is not really designed for structured matrix inputs; does this mean I should be using a different package altogether, or is there a way to formulate it?

(For context, in Python, the following works:

import numpy as np
import cvxpy as cp
def nuclear_norm_minimization(A, mask):
    X = cp.Variable(A.shape)
    objective = cp.Minimize(cp.norm(X, "nuc"))
    constraints = [X[mask] == A[mask]]
    problem = cp.Problem(objective, constraints)
    problem.solve()
    return X.value

But I would prefer to stay in Julia, and I am trying to see whether a convenient solution exists on this side.)

Hi @MLL, welcome to JuMP.

First, here’s how you can solve your model with JuMP. (Note that I made up B, because it wasn’t in your example code.)

julia> using JuMP

julia> import Random

julia> import SCS

julia> begin
           rng = Random.MersenneTwister(1234)
           n, r = 20, 2
           Y = randn(rng, n, r) * randn(rng, r, n)
           X = Y .* (rand(rng, n, n) .< 1 / 2)
           mask = X .!= 0
           B = rand(rng, n, n)
           model = Model(SCS.Optimizer)
           @variable(model, X[1:n, 1:n])
           @constraint(model, X[mask] .== B[mask])
           @variable(model, t)
           @constraint(model, [t; vec(X)] in MOI.NormNuclearCone(n, n))
           @objective(model, Min, t)
           optimize!(model)
       end
------------------------------------------------------------------
	       SCS v3.2.6 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 821, constraints m: 1006
cones: 	  z: primal zero / dual free vars: 185
	  l: linear vars: 1
	  s: psd vars: 820, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 1046, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 7.78e+01  1.00e+00  3.78e+02 -1.72e+02  1.00e-01  1.76e-03 
   125| 3.73e-05  1.47e-05  3.48e-07  2.03e+01  6.29e-01  2.06e-02 
------------------------------------------------------------------
status:  solved
timings: total: 2.06e-02s = setup: 4.78e-04s + solve: 2.02e-02s
	 lin-sys: 1.85e-03s, cones: 1.63e-02s, accel: 8.10e-05s
------------------------------------------------------------------
objective = 20.349979
------------------------------------------------------------------
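Once SCS reports solved, you can recover the completed matrix with value.(X). (This snippet is not from the original post; it is just one way to sanity-check the result, using value, objective_value, and LinearAlgebra.svdvals.)

import LinearAlgebra
X_sol = value.(X)                      # the completed matrix
objective_value(model)                 # the optimal t, i.e. the minimized nuclear norm
sum(LinearAlgebra.svdvals(X_sol))      # should agree with the objective up to solver tolerance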

I note that we don’t have this example in our documentation, so it’s not surprising that you didn’t find it! I’ll fix that by adding a new section to Modeling with cones · JuMP

Edit: see [docs] add Norm{Nuclear,Spectral}Cone examples by odow · Pull Request #3887 · jump-dev/JuMP.jl · GitHub

Your other attempts didn’t work for a number of reasons:

You need to use op_nuclearnorm in @objective, not the function nuclearnorm.

But operators take splatted scalar arguments, so you cannot pass a matrix X.

The result would be a nonlinear program, so you couldn’t use ProxSDP; you’d need a solver like Ipopt.

All together, it’d be something like this:

using JuMP, Ipopt
import LinearAlgebra
model = Model(Ipopt.Optimizer)
@variable(model, X[1:n, 1:n])
nuclearnorm(x...) = sum(LinearAlgebra.svdvals(reshape(collect(x), n, n)))
@operator(model, op_nuclearnorm, n^2, nuclearnorm)
@objective(model, Min, op_nuclearnorm(X...))

except this would still be broken, because we use ForwardDiff.jl to automatically compute the derivatives of nuclearnorm, and ForwardDiff.jl does not support differentiating through svdvals.
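(As an aside, and purely as a sketch that is not from this thread: @operator also accepts a hand-written gradient, so in principle you could work around the ForwardDiff limitation by supplying the gradient of the nuclear norm yourself, which at a point where all singular values are nonzero is U * V' from the SVD. The name ∇nuclearnorm below is just an illustrative helper, and n, model, and X are assumed to be defined as above.)

import LinearAlgebra
nuclearnorm(x...) = sum(LinearAlgebra.svdvals(reshape(collect(x), n, n)))
# In-place gradient: d||X||_* / dX = U * V' when X has no zero singular values
# (otherwise this is only a subgradient).
function ∇nuclearnorm(g::AbstractVector, x...)
    F = LinearAlgebra.svd(reshape(collect(x), n, n))
    g .= vec(F.U * F.Vt)
    return
end
@operator(model, op_nuclearnorm, n^2, nuclearnorm, ∇nuclearnorm)
@objective(model, Min, op_nuclearnorm(X...))

Even then you would be solving a general nonlinear program without the conic structure, so this is unlikely to beat the conic formulation.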

So all in all, it’s just a lot easier to use MOI.NormNuclearCone!

Also, since you used cvxpy, you might be interested in Convex.jl:

julia> using Convex

julia> import Random

julia> import SCS

julia> begin
           rng = Random.MersenneTwister(1234)
           n, r = 20, 2
           Y = randn(rng, n, r) * randn(rng, r, n)
           X = Y .* (rand(rng, n, n) .< 1 / 2)
           mask = X .!= 0
           B = rand(rng, n, n)
           X = Variable(n, n)
           problem = minimize(nuclearnorm(X), X[mask] .== B[mask])
           solve!(problem, SCS.Optimizer)
           X.value
       end
[ Info: [Convex.jl] Compilation finished: 0.01 seconds, 2.660 MiB of memory allocated
------------------------------------------------------------------
	       SCS v3.2.6 - Splitting Conic Solver
	(c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 821, constraints m: 1006
cones: 	  z: primal zero / dual free vars: 185
	  l: linear vars: 1
	  s: psd vars: 820, ssize: 1
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-07
	  alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
	  max_iters: 100000, normalize: 1, rho_x: 1.00e-06
	  acceleration_lookback: 10, acceleration_interval: 10
	  compiled with openmp parallelization enabled
lin-sys:  sparse-direct-amd-qdldl
	  nnz(A): 1046, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 7.78e+01  1.00e+00  3.78e+02 -1.72e+02  1.00e-01  1.68e-03 
   125| 3.73e-05  1.47e-05  3.48e-07  2.03e+01  6.29e-01  3.68e-02 
------------------------------------------------------------------
status:  solved
timings: total: 3.68e-02s = setup: 1.16e-03s + solve: 3.56e-02s
	 lin-sys: 3.47e-03s, cones: 3.05e-02s, accel: 1.43e-04s
------------------------------------------------------------------
objective = 20.349979
------------------------------------------------------------------
20×20 Matrix{Float64}:
 0.557152    0.920156  0.589795   0.79987   0.70956    …  0.914598   0.772662   0.620617     0.498412
 0.209943    0.403376  0.395392   0.209961  0.284305      0.503401   0.290241   0.401812     0.333887
 0.829296    0.306834  0.431495   0.890567  0.371786      0.552276   0.166319   0.0873135   -0.146714
 0.0519182   0.296777  0.182403   0.128044  0.168737      0.141098   0.552669   0.0433472    0.202201
 0.171119    0.222723  0.19481    0.193909  0.0705132     0.841253   0.01535    0.674796     0.0996287
 0.950475    0.618636  0.936706   0.959606  0.765998   …  0.852024   0.539721   0.408937     0.301514
 0.49767     0.546569  0.933637   0.362951  0.826305      0.425226   0.388858   0.480213     0.320842
 0.0931666   0.249916  0.250004   0.374395  0.365294      0.611173   0.598031   0.590054     0.81137
 0.424858    0.387732  0.203816   0.7124    0.373468      0.349959   0.604177   0.0673362    0.201775
 0.493582    0.752531  0.568588   0.518546  0.675288      0.504677   0.678005   0.381614     0.147359
 0.384668    0.309879  0.633292   0.467537  0.485769   …  0.708239   0.239181   0.54793      0.263735
 0.198       0.329899  0.0048924  0.406721  0.133857      0.630309   0.473034   0.229758     0.423605
 0.50355     0.507337  0.579415   0.595257  0.843093      0.925976   0.269      0.977187     0.323974
 0.218986    0.463496  0.625497   0.322904  0.328661      0.0821005  0.379687  -0.00519354   0.472617
 0.499592    0.448429  0.454953   0.77382   0.470216      0.750303   0.574043   0.328166     0.587934
 0.396173    0.704925  0.697954   0.270462  0.524919   …  0.967519   0.293556   0.901651     0.330099
 0.00582306  0.344542  0.15032    0.510433  0.0986667     0.571481   0.509482   0.453357     0.822917
 0.428621    0.390525  0.447615   0.513719  0.641045      0.375063   0.630625   0.34314      0.49605
 0.0504247   0.359147  0.191784   0.246822  0.302454      0.313561   0.400431   0.0971127    0.60808
 0.358226    0.43062   0.507477   0.479273  0.569764      0.614246   0.561656   0.496115     0.255054

This works perfectly, thank you! (and faster than a PyCall solution!)

I also tried the Convex solution you mentioned, and noticed that for large matrices it seems to have a much larger memory footprint than the JuMP implementation and takes about twice as long, so I’ll stick with JuMP for now.

Thanks again!

This would be a good subject for a JuMP tutorial.

For anyone looking into this, a key reference is the paper
Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization by Recht, Fazel, and Parrilo (SIAM Review, 2010).

The syntax is now in the documentation:

Perhaps we could add the matrix completion example to

I would suggest trying Hypatia instead of SCS. Hypatia has a specialized nuclear norm cone, and in my experience it is much faster than the SDP reformulation that is otherwise necessary.
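For example (just a sketch, reusing n, mask, and B from the earlier answer; this simply swaps the optimizer in the same JuMP formulation, and JuMP’s bridges will reformulate the cone if Hypatia does not accept MOI.NormNuclearCone directly):

using JuMP, Hypatia
model = Model(Hypatia.Optimizer)
@variable(model, X[1:n, 1:n])
@constraint(model, X[mask] .== B[mask])
@variable(model, t)
@constraint(model, [t; vec(X)] in MOI.NormNuclearCone(n, n))  # t >= ||X||_*
@objective(model, Min, t)
optimize!(model)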