How to solve a large quadratic programming problem

Hello everyone,

I was looking for the most appropriate way to solve the following minimization problem in Julia.

\arg\min ||g-K f||^2+α||f||^2\equiv \arg\min \frac{1}{2}f^T(K^T K+αI)f-g^TKf
s.t. f\geq0

(where K,I are matrices, g,f are vectors and α is a scalar)

It appears there is a package out there which does just that, Quadprog.jl (https://juliapackages.com/p/quadprog) but currently I can’t download it from Pkg, so I reckon it’s not supported anymore.
Another package focused on this exact type of problem does exist (RegularizationTools.jl), but unfortunately its performance is poor for large matrices like the ones I’m dealing with (K: 300x10000, for example).

I tried using JuMP with open source solvers such as HiGHS and Ipopt, using the following lines of code:

using JuMP, Ipopt, LinearAlgebra  # LinearAlgebra for the identity I

model = Model(Ipopt.Optimizer)
set_time_limit_sec(model, 60.0)
@variable(model, f[1:size(K)[2]])
@constraint(model, f .>= 0)
@objective(model, Min, (1/2)*f'*(K'*K + α*I)*f - g'*K*f)
optimize!(model)

but it just hits the time limit, and also takes ages to compile.
This makes me believe I must be doing something wrong.

Any insights into this?
Please excuse me if that’s a trivial question, I’m very much new to optimization problems.



P.S. The paper where I found this example is citing this one as the source of the method they used for solving the minimization, followed by this explanation:

For a given \alpha, the QPP is transformed with f=K^Tc, where c is a fitting vector given by c=\frac{g-Kf}{\alpha}, and non-negativity is enforced by f=\max(0,K^Tc). The value of c is determined by unconstrained minimization.

Perhaps this method is already implemented by some Julia package?

Have you tried Convex.jl? It supports a few backend solvers, and once your problem is set up you can choose the fastest one among them. If your matrix K is sparse, some of them might be able to take advantage of it.

I would not recommend a general nonlinear programming solver such as Ipopt for this job. Besides searching for QP solvers (such as OSQP and its Julia wrapper OSQP.jl), you may want to consider solvers for conic problems, because these include QP as a special case. For example COSMO.jl.


Note that in general JuMP probably remains the right interface for you, in the sense that you should use it to model your problem and call the underlying solver (whichever one you choose).

Regarding the choice of solver, you can check out the list

and start with the ones that support “QP”. Although I’m a bit scared by the size of your matrix, and I also hope it is sparse?

If you can provide a full runnable example (with the true parameters you use) this will be a great help.

Thank you for your replies all. I’m happy to test the different solvers you mentioned and provide benchmarks.

@gdalle Looking at the documentation for the previous suggestions, I thought the same: JuMP seems like the most convenient option, as it supports OSQP and COSMO (but not Convex.jl; I’ll have to go through its documentation to figure out how to use it).

My only concern is, what would be the best way to express my problem in JuMP’s language?
Should I just write down the equation as in the OP?
It seems that doing it like this converts the problem to an expression like the following:

0.5000000164892263 f[1]² + 2.3627519043796735e-7 f[2]*f[1] + 1.1487017830384546e-6 f[3]*f[1] + 4.090326156802839e-6 f[4]*f[1] + 1.1342284303483471e-5 f[5]*f[1] + 2.5728925031670364e-5 f[6]*f[1] + 4.967027079759235e-5 f[7]*f[1] + 8.42400000379402e-5 f[8]*f[1] + 0.00012875502835469593 f[9]*f[1] + 0.00018102137081077994 f[10]*f[1] + 0.00023799035806169666 f[11]*f[1] + 0.00029647630828148053 f[12]*f[1] + 0.00035369526625692515 f[13]*f[1] + 0.00040754667306948665 f[14]*f[1] + 0.000456673612395692 f[15]*f[1] + 0.0005003816876960387 f[16]*f[1] + 0.0005384933808265713 f[17]*f[1] + 0.0005711918030349512 f[18]*f[1] + 0.0005988834098988423 f[19]*f[1] + 0.0006220911227702024 f[20]*f[1] + 0.0006413784296372176 f[21]*f[1] + 0.0006572998857866318 f[22]*f[1] + 0.0006703718813086261 f[23]*f[1] + 0.000681057799433155 f[24]*f[1] + 0.0006897627124368662 f[25]*f[1] + 0.0006968339938715336 f[26]*f[1] + 0.0007025652917383758 f[27]*f[1] + 0.0007072021289371369 f[28]*f[1] + 0.0007109480543175646 f[29]*f[1] + 0.0007139707137770652 f[30]*f[1] + [[...8390596 terms omitted...]] + 13698.321297395541 f[4095]*f[4089] + 13715.376530514337 f[4096]*f[4089] + 6795.448522945562 f[4090]² + 13640.4961655343 f[4091]*f[4090] + 13681.314586976976 f[4092]*f[4090] +

and tries to minimise that.
I’ve got no idea whether that’s the “normal” way of doing it or not, but generating this expression takes quite a lot of time.
I already tried OSQP to solve it but it doesn’t even converge.

Therefore, I thought before testing everything that supports QP, I might as well ask for the correct way to give the problem to JuMP.

Regarding sparsity, K is unfortunately not sparse at all, there’s not a single zero in there.
Here is a link to a onedrive folder with a K and g example (sized 225x4096 and 225x1, respectively, these are the “small” cases). You can read them like readdlm("K.dat", ','); readdlm("g.dat", ','). The scalar α can be any value, say, 1. The goal is to solve repeatedly for many many values of α and choose the nicest one.

I know there are codes that can effortlessly handle this problem, because I’m already using ones written by other people in MATLAB. Unfortunately, I can’t access and share the code.

As soon as you form the huge matrix K^T K in this sort of problem, you lose. You really want a method that avoids forming this explicitly.

Two options to consider:

  1. You could just throw the function h(x) = \Vert g - Kx \Vert^2 + \alpha \Vert x \Vert^2 = x^T (K^T K + \alpha I) x - 2 g^T K x + g^T g at a generic gradient-based optimization algorithm with bound constraints x \succeq 0, like L-BFGS. The key point is to compute h and its gradient efficiently: compute h(x) = \Vert \cdots \Vert^2 + \Vert \cdots \Vert^2 via the sum of two norms, which only requires you to compute Kx, without ever forming K^T K + \alpha I, and compute the gradient via \nabla h = 2 [ K^T (Kx) + \alpha x] - 2K^T g (parens indicate order of operation), using only matrix–vector operations and never matrix–matrix (AD should also be capable of doing this for you). A minimal sketch of this is given after this list.

  2. You could implement a more specialized method that exploits the specific structure of this problem. See the thread Suggestions needed for bound-constrained least squares solver for inspiration, though you’ll need to modify the solutions to take into account the \alpha term to avoid explicitly forming the huge matrix [K; \alpha I].
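For concreteness, here is a rough (untested) sketch of option 1, assuming Optim.jl’s Fminbox/L-BFGS interface; solve_nnls_tikhonov is just a placeholder name:

using LinearAlgebra, Optim

function solve_nnls_tikhonov(K, g, α)
    n = size(K, 2)
    Ktg = K' * g                                   # precompute Kᵀg once
    # h(x) = ‖g - Kx‖² + α‖x‖², evaluated without ever forming KᵀK
    h(x) = sum(abs2, g - K * x) + α * sum(abs2, x)
    # ∇h(x) = 2[Kᵀ(Kx) + αx] - 2Kᵀg, matrix–vector products only
    function h_grad!(G, x)
        G .= 2 .* (K' * (K * x) .+ α .* x) .- 2 .* Ktg
        return G
    end
    lower = zeros(n)        # bound constraint x ≥ 0
    upper = fill(Inf, n)
    x0 = ones(n)            # strictly interior starting point
    res = optimize(h, h_grad!, lower, upper, x0, Fminbox(LBFGS()))
    return Optim.minimizer(res)
end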


Any QP method that relies on explicitly forming the matrix will suck here (unless K is sparse, I guess, and the method explicitly supports sparsity).

You shouldn’t need it to be sparse, because viewed as a least-squares problem the matrix K is really quite small — for example, rand(300, 10000) \ rand(300) takes a fraction of a second.


Yeah good point. I think both JuMP and Convex are able to handle an objective formulated as a squared norm, but I don’t know if JuMP’s reverse-mode AD or Convex’s reformulation will get around materializing K' * K.

According to the docs (Supported Operations · Convex.jl), Convex.jl should be able to handle sumsquares(K*x) correctly.

I also think that, from the instantiated problem and solver in Convex.jl, you should be able to check what happened in the expression tree, to see from the matrix sizes alone whether K'*K was computed or not.

The docs on ridge regression (Lasso, Ridge and Elastic Net Regressions · Convex.jl) correspond to what you want, plus the positivity constraint.
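For what it’s worth, an untested sketch of that formulation with Convex.jl (using SCS as the backend here, but any supported conic solver should slot in) might look like:

using Convex, SCS

function solve_convex_ridge(K, g, α)
    f = Variable(size(K, 2))
    # objective stays expressed as ‖g - K f‖² + α‖f‖²; whether K'K ever gets
    # materialized is then up to Convex.jl's reformulation, not this code
    problem = minimize(sumsquares(g - K * f) + α * sumsquares(f), [f >= 0])
    solve!(problem, SCS.Optimizer)
    return evaluate(f)
end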


The JuMP approach would be something like this:

using JuMP, SCS

function main(K, g, α)
    model = Model(SCS.Optimizer)
    @variable(model, f[1:size(K, 2)] >= 0)
    @variable(model, t[1:2])
    # rotated second-order cone: 2 * t[1] * 0.5 ≥ ‖g - K*f‖², i.e. t[1] ≥ ‖g - K*f‖²
    @constraint(model, [t[1]; 0.5; g - K * f] in RotatedSecondOrderCone())
    # likewise t[2] ≥ ‖f‖², so the objective below is ‖g - K*f‖² + α‖f‖²
    @constraint(model, [t[2]; 0.5; f] in RotatedSecondOrderCone())
    @objective(model, Min, t[1] + α * t[2])
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return value.(f)
end

m, n = 300, 10_000
K = rand(m, n);
g = rand(m);
α = 0.5
main(K, g, α)

See the docs:


Interior-point methods would be very well suited for this kind of problem. We call it a bound-constrained linear least-squares problem. An easy way to solve it is to rewrite it as

\begin{align*} \min_{f,r} & \ \|r\|^2 + \alpha \|f\|^2 \\ \mathrm{subject \ to} & \ Kf + r = g \\ & f \geq 0. \end{align*}

That formulation will avoid any occurrence of K^T K, which could be dense and ill conditioned.

The problem can be solved using RipQP.


I don’t have anything to add to this thread because it’s not my area of expertise. However, this caught my eye: it sounds like you want to optimize alpha with respect to some measure as well. Perhaps it would make sense to combine this into a larger optimization problem to get the desired alpha directly? Perhaps you’d get an even better solution for your “real” problem if you explain what the “true” objective of your code is 🙂


Thank you everyone for the comments, they are very helpful.

@stevengj Thanks for the insights. Avoiding K^TK seems very sensible indeed. As for your first recommendation, googling “L-BFGS julia” brings up Optim.jl as the package to use. Is this what you would use for this problem, or could I stick to JuMP?

@odow Thank you for the clear example. The JuMP approach you recommend seems to work well for the random matrices you provide, but it doesn’t converge for the data I’m using. There is of course a possibility of me messing up the preprocessing which leads to the numbers in K and g, so I’ll have to further investigate whether that’s a limitation of this formulation or an error on my side.

I tried looking at the documentation for both Convex.jl recommended by @lrnv and RipQP.jl recommended by @dpo, but I can’t seem to find out how I can express my problem in the language of these packages (perhaps I’m a bit too tired and I should look again with a clear mind tomorrow).

@abraemer From what I’ve seen in the literature, people seem to solve repeatedly for different alphas until some condition is met (different options are available to choose the right alpha, such as the L-curve or GCV). Perhaps reformulating the problem as you suggest might be more efficient, but I’d assume in that case others would have implemented it as well. As you can probably tell, I’m nowhere near being an expert on the field either, so I’m blindly following resources until I’m able to understand a bit more.
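To make the repeated solves concrete, this is roughly the kind of scan I have in mind (just a sketch; solve_for_alpha stands in for whichever of the solve functions above ends up working):

using LinearAlgebra

# solve_for_alpha(K, g, α) is a placeholder for whichever solver I end up using
function alpha_scan(K, g, solve_for_alpha; alphas = 10.0 .^ range(-4, 2; length = 25))
    residual_norms = Float64[]
    solution_norms = Float64[]
    for α in alphas
        f = solve_for_alpha(K, g, α)
        push!(residual_norms, norm(g - K * f))   # ‖g - Kf‖, one axis of the L-curve
        push!(solution_norms, norm(f))           # ‖f‖, the other axis
    end
    # the "nicest" α is then picked from these two vectors, e.g. at the L-curve corner or by GCV
    return alphas, residual_norms, solution_norms
end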

For anyone interested in more details for any reason, here is a paper describing the code I’ve been trying to reproduce.


Where is K from? The condition number is quite large:

julia> LinearAlgebra.cond(K)
5.806497173635263e6

The likely issue is that your K matrix has a bunch of very small (but non-zero) elements like 2.530735066695972e-12.

I got it to solve with Clarabel.jl by using ||f||_2 instead of ||f||^2 (and likewise for the residual term):

julia> using JuMP, Clarabel, DelimitedFiles

julia> function main(K_filename, g_filename, α = 1)
           K = DelimitedFiles.readdlm(K_filename, ',')
           g = vec(DelimitedFiles.readdlm(g_filename))
           model = Model(Clarabel.Optimizer)
           @variable(model, f[1:size(K, 2)] >= 0)
           @variable(model, t[1:2])
           @constraint(model, [t[1]; g - K * f] in SecondOrderCone())
           @constraint(model, [t[2]; f] in SecondOrderCone())
           @objective(model, Min, t[1] + α * t[2])
           optimize!(model)
           @assert is_solved_and_feasible(model)
           return value.(f)
       end
main (generic function with 3 methods)

julia> f = main("K.dat", "g.dat");
-------------------------------------------------------------
           Clarabel.jl v0.7.1  -  Clever Acronym              
                   (c) Paul Goulart                          
                University of Oxford, 2022                   
-------------------------------------------------------------

problem:
  variables     = 4098
  constraints   = 8419
  nnz(P)        = 0
  nnz(A)        = 929794
  cones (total) = 3
    : Nonnegative = 1,  numel = 4096
    : SecondOrder = 2,  numel = (4097,226)

settings:
  linear algebra: direct / qdldl, precision: Float64
  max iter = 200, time limit = Inf,  max step = 0.990
  tol_feas = 1.0e-08, tol_gap_abs = 1.0e-08, tol_gap_rel = 1.0e-08,
  static reg : on, ϵ1 = 1.0e-08, ϵ2 = 4.9e-32
  dynamic reg: on, ϵ = 1.0e-13, δ = 2.0e-07
  iter refine: on, reltol = 1.0e-13, abstol = 1.0e-12, 
               max iter = 10, stop ratio = 5.0
  equilibrate: on, min_scale = 1.0e-05, max_scale = 1.0e+05
               max iter = 10

iter    pcost        dcost       gap       pres      dres      k/t        μ       step      
---------------------------------------------------------------------------------------------
  0   0.0000e+00  -0.0000e+00  0.00e+00  9.34e-01  7.06e-03  1.00e+00  3.61e+05   ------   
  1   1.0070e+05   2.0599e+05  1.05e+00  8.38e-01  1.70e-03  1.05e+05  3.03e+05  4.32e-01  
  2   1.9181e+05   2.2772e+05  1.87e-01  7.39e-01  1.17e-03  3.59e+04  3.00e+05  6.55e-02  
  3   1.0162e+05   1.3146e+05  2.94e-01  7.30e-01  1.46e-03  2.98e+04  2.96e+05  7.11e-02  
  4   5.3457e+04   6.4570e+04  2.08e-01  5.89e-01  1.08e-03  1.11e+04  2.46e+05  2.74e-01  
  5   5.4281e+04   6.7608e+04  2.46e-01  6.08e-01  1.23e-03  1.33e+04  2.39e+05  9.16e-02  
  6   2.9245e+04   3.5426e+04  2.11e-01  4.88e-01  1.03e-03  6.18e+03  1.91e+05  3.24e-01  
  7   1.4755e+04   1.6753e+04  1.35e-01  3.00e-01  6.18e-04  2.00e+03  1.12e+05  5.45e-01  
  8   1.3096e+04   1.4807e+04  1.31e-01  2.65e-01  5.34e-04  1.71e+03  9.18e+04  3.47e-01  
  9   6.5932e+03   7.2587e+03  1.01e-01  1.57e-01  3.91e-04  6.66e+02  5.95e+04  4.53e-01  
 10   4.1205e+03   4.2171e+03  2.34e-02  5.19e-02  1.54e-04  9.65e+01  2.74e+04  6.84e-01  
 11   3.9859e+03   4.0721e+03  2.16e-02  4.65e-02  1.43e-04  8.62e+01  2.50e+04  2.51e-01  
 12   3.6129e+03   3.6233e+03  2.88e-03  1.51e-02  5.13e-05  1.04e+01  1.44e+04  6.24e-01  
 13   3.5228e+03   3.5283e+03  1.58e-03  7.64e-03  2.67e-05  5.56e+00  7.97e+03  5.59e-01  
 14   3.5025e+03   3.5049e+03  6.95e-04  4.61e-03  1.61e-05  2.43e+00  5.22e+03  6.07e-01  
 15   3.5121e+03   3.5135e+03  3.80e-04  2.38e-03  8.32e-06  1.33e+00  2.65e+03  6.16e-01  
 16   3.5788e+03   3.5803e+03  4.10e-04  2.15e-03  7.39e-06  1.47e+00  1.46e+03  6.39e-01  
 17   3.6455e+03   3.6467e+03  3.22e-04  1.72e-03  5.84e-06  1.17e+00  1.00e+03  4.10e-01  
 18   3.7313e+03   3.7324e+03  2.81e-04  1.38e-03  4.65e-06  1.05e+00  6.23e+02  4.66e-01  
 19   3.8366e+03   3.8375e+03  2.46e-04  1.12e-03  3.73e-06  9.43e-01  3.95e+02  4.52e-01  
 20   3.9598e+03   3.9606e+03  2.11e-04  9.08e-04  2.99e-06  8.34e-01  2.53e+02  4.62e-01  
 21   4.1338e+03   4.1346e+03  1.91e-04  7.20e-04  2.34e-06  7.90e-01  1.43e+02  5.57e-01  
 22   4.5426e+03   4.5434e+03  1.74e-04  4.54e-04  1.43e-06  7.88e-01  5.34e+01  7.13e-01  
 23   4.7120e+03   4.7125e+03  9.11e-05  3.23e-04  1.02e-06  4.29e-01  4.02e+01  5.15e-01  
 24   5.0253e+03   5.0257e+03  7.27e-05  2.18e-04  6.80e-07  3.65e-01  2.03e+01  5.79e-01  
 25   5.1793e+03   5.1797e+03  8.31e-05  1.98e-04  6.12e-07  4.31e-01  1.31e+01  4.95e-01  
 26   5.5708e+03   5.5714e+03  9.43e-05  1.50e-04  4.50e-07  5.25e-01  5.86e+00  6.36e-01  
 27   6.1250e+03   6.1253e+03  5.93e-05  7.72e-05  2.26e-07  3.63e-01  2.22e+00  8.70e-01  
 28   6.3219e+03   6.3223e+03  5.55e-05  6.35e-05  1.85e-07  3.51e-01  1.51e+00  5.40e-01  
 29   6.7292e+03   6.7294e+03  1.95e-05  2.11e-05  6.12e-08  1.32e-01  5.52e-01  9.74e-01  
 30   6.8932e+03   6.8933e+03  1.39e-05  1.45e-05  4.17e-08  9.60e-02  3.28e-01  6.41e-01  
 31   7.0217e+03   7.0218e+03  4.64e-06  4.55e-06  1.30e-08  3.26e-02  1.06e-01  8.70e-01  
 32   7.0633e+03   7.0633e+03  2.15e-06  2.05e-06  5.86e-09  1.52e-02  4.93e-02  9.80e-01  
 33   7.0787e+03   7.0787e+03  1.00e-06  9.37e-07  2.67e-09  7.08e-03  2.24e-02  7.28e-01  
 34   7.0879e+03   7.0879e+03  4.44e-07  4.09e-07  1.17e-09  3.14e-03  9.76e-03  8.75e-01  
 35   7.0915e+03   7.0915e+03  1.23e-07  1.13e-07  3.23e-10  8.75e-04  2.70e-03  8.90e-01  
 36   7.0927e+03   7.0927e+03  3.39e-08  3.11e-08  8.87e-11  2.40e-04  7.42e-04  8.50e-01  
 37   7.0930e+03   7.0930e+03  7.87e-09  7.23e-09  2.06e-11  5.58e-05  1.72e-04  8.25e-01  
---------------------------------------------------------------------------------------------
Terminated with status = solved
solve time =  8.88s

The Ipopt code would be:

using Ipopt  # JuMP and DelimitedFiles are loaded above

function main(K_filename, g_filename, α = 1)
    K = DelimitedFiles.readdlm(K_filename, ',')
    g = vec(DelimitedFiles.readdlm(g_filename))
    model = Model(Ipopt.Optimizer)
    @variable(model, f[1:size(K, 2)] >= 0)
    @variable(model, y[1:size(K, 1)])
    @constraint(model, y .== g .- K * f)
    @objective(model, Min, sum(y.^2) + α * sum(f.^2))
    optimize!(model)
    @assert is_solved_and_feasible(model)
    return value.(f)
end

but the numerics are not nice.

To complete @dpo’s answer and to conform to the notation of Reference · QuadraticModels.jl, I think your problem can be written as

\min_x \tfrac{1}{2}x^THx \quad \text{s.t.} \quad Ax = b, \quad \text{and} \quad x_i \ge 0, \quad i=1, ..., n,

where

x = \begin{bmatrix} f \\ r \end{bmatrix}, H = \begin{bmatrix} 2 \alpha I & 0 \\ 0 & 2I \end{bmatrix}, A = \begin{bmatrix} K & I \end{bmatrix}, b = g, and f \in \mathbb{R}^n

Then, a small example with RipQP.jl would give

using RipQP, QuadraticModels, LinearAlgebra, SparseArrays
m, n = 30, 100
K = rand(m, n)
α = 1.0
g = rand(m)
A = [sparse(K) I]
H = [2*α*I  spzeros(n, m);
     spzeros(m, n)  2*I]
c = zeros(m + n)
lvar = [zeros(n); fill(-Inf, m)]
qm = QuadraticModel(c, H; A = A, lcon = g, ucon = g, lvar = lvar)
stats = ripqp(qm)
f, r = stats.solution[1:n], stats.solution[n+1:n+m]

I’m not sure why the condition number should matter here because there is a Tikhonov regularization (the \alpha term), no?

I’m not sure why the condition number should matter

SCS etc. solve a linear system with K * f + y = g; they don’t work with the quadratic matrix directly. It’s not necessarily a problem, but it does hint at why a variety of different solvers (Ipopt, SCS, Clarabel) struggle to solve this.

RipQP works nicely:

julia> using RipQP, QuadraticModels, SparseArrays, DelimitedFiles, LinearAlgebra

julia> function main(K_filename, g_filename, α = 1)
            K = DelimitedFiles.readdlm(K_filename, ',')
            g = vec(DelimitedFiles.readdlm(g_filename))
            m, n = size(K)
            A = [SparseArrays.sparse(K) LinearAlgebra.I]
            H = [
                 α*2*LinearAlgebra.I SparseArrays.spzeros(n, m)
                 SparseArrays.spzeros(m, n)  2*LinearAlgebra.I
            ]
            c = zeros(m + n)
            lvar = vcat(zeros(n), fill(-Inf, m))
            qm = QuadraticModel(c, H; A = A, lcon = g, ucon = g, lvar = lvar)
            stats = ripqp(qm)
            return stats.solution[1:n]
       end
main (generic function with 2 methods)

julia> @time main("/Users/oscar/Downloads/OneDrive_1_3-6-2024/K.dat", "/Users/oscar/Downloads/OneDrive_1_3-6-2024/g.dat");
[ Info: Solving in Float64 using K2LDL with LDLFactorizationData
[ Info:   iter       obj      rgap      ‖rb‖      ‖rc‖     α_pri      α_du         μ         ρ         δ  
[ Info:      0   1.8e+08   1.7e+00   4.0e+05   1.2e+03   0.0e+00   0.0e+00   5.3e+05   1.5e-03   1.5e-03
[ Info:      1   4.3e+07   3.0e+00   6.3e+04   1.9e+02   8.4e-01   9.7e-01   9.3e+04   1.5e-03   1.5e-03
[ Info:      2   2.1e+07   2.5e+00   1.6e+04   6.5e+01   7.4e-01   8.4e-01   2.6e+04   1.5e-04   1.5e-04
[ Info:      3   1.6e+07   9.8e-01   5.5e+03   3.0e+02   6.6e-01   9.5e-01   7.9e+03   1.5e-05   1.5e-05
[ Info:      4   1.5e+07   1.2e-01   2.7e+03   4.0e+02   5.1e-01   8.6e-01   3.2e+03   1.5e-06   1.5e-06
[ Info:      5   1.5e+07   4.4e-01   1.8e+03   8.7e+02   3.6e-01   9.1e-01   2.1e+03   1.5e-07   1.5e-07
[ Info:      6   1.7e+07   7.8e-01   1.1e+03   1.2e+03   3.9e-01   1.0e+00   1.8e+03   1.5e-08   1.5e-08
[ Info:      7   2.0e+07   5.4e-01   6.5e+02   1.0e+03   3.9e-01   1.0e+00   1.7e+03   1.5e-09   1.5e-09
[ Info:      8   2.4e+07   1.2e-01   1.7e+02   2.5e+02   7.5e-01   9.8e-01   4.5e+02   1.5e-10   1.5e-09
[ Info:      9   2.5e+07   9.4e-03   1.7e-07   3.5e-09   1.0e+00   1.0e+00   5.9e+01   1.5e-11   1.5e-09
[ Info:     10   2.5e+07   4.1e-04   1.5e-07   5.8e+00   9.9e-01   7.7e-01   8.5e+00   1.5e-12   1.5e-09
[ Info:     11   2.5e+07   7.5e-05   4.3e-08   1.7e+00   1.0e+00   8.3e-01   1.6e+00   1.5e-13   1.5e-09
[ Info:     12   2.5e+07   6.1e-06   9.8e-09   2.8e-01   1.0e+00   9.0e-01   1.8e-01   1.5e-14   1.5e-09
[ Info:     13   2.5e+07   2.6e-07   1.3e-09   5.3e-02   1.0e+00   9.2e-01   1.6e-02   1.5e-14   1.5e-09
[ Info:     14   2.5e+07   5.1e-08   2.0e-10   6.2e-03   1.0e+00   9.7e-01   8.4e-04   1.5e-14   1.5e-09
[ Info:     15   2.5e+07   5.6e-09   1.3e-10   5.3e-04   1.0e+00   9.9e-01   4.0e-05   1.5e-14   1.5e-09
  3.773461 seconds (2.69 M allocations: 314.398 MiB, 2.15% gc time)

Gurobi also works:

julia> using JuMP

julia> import Gurobi

julia> function main_grb(K_filename, g_filename, α = 1)
            K = DelimitedFiles.readdlm(K_filename, ',')
            g = vec(DelimitedFiles.readdlm(g_filename))
            model = Model(Gurobi.Optimizer)
            @variable(model, f[1:size(K, 2)] >= 0)
            @variable(model, y[1:size(K, 1)])
            @constraint(model, y .== g .- K * f)
            @objective(model, Min, sum(y.^2) + α * sum(f.^2))
            optimize!(model)
            @assert is_solved_and_feasible(model)
            return value.(f)
       end
main_grb (generic function with 2 methods)

julia> @time main_grb("/Users/oscar/Downloads/OneDrive_1_3-6-2024/K.dat", "/Users/oscar/Downloads/OneDrive_1_3-6-2024/g.dat");
Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (mac64[x86] - Darwin 22.6.0 22G91)

CPU model: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads

Optimize a model with 225 rows, 4321 columns and 921825 nonzeros
Model fingerprint: 0x9c35740a
Model has 4321 quadratic objective terms
Coefficient statistics:
  Matrix range     [3e-12, 1e+02]
  Objective range  [0e+00, 0e+00]
  QObjective range [2e+00, 2e+00]
  Bounds range     [0e+00, 0e+00]
  RHS range        [4e-01, 4e+05]
Warning: Model contains large matrix coefficient range
         Consider reformulating model or setting NumericFocus parameter
         to avoid numerical issues.
Presolve time: 0.16s
Presolved: 225 rows, 4321 columns, 921825 nonzeros
Presolved model has 4321 quadratic objective terms
Ordering time: 0.00s

Barrier statistics:
 Free vars  : 225
 AA' NZ     : 2.520e+04
 Factor NZ  : 2.542e+04 (roughly 2 MB of memory)
 Factor Ops : 3.822e+06 (less than 1 second per iteration)
 Threads    : 4

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   1.41172704e+12 -1.39431763e+12  1.89e+07 1.60e+07  2.22e+08     0s
   1   2.52149983e+10 -2.42290713e+10  1.71e+06 1.45e+06  2.31e+07     0s
   2   2.55669320e+09 -3.00982544e+09  1.80e+05 1.53e+05  3.25e+06     0s
   3   4.40159458e+08 -7.67993209e+08  8.44e+03 7.17e+03  3.87e+05     0s
   4   1.82155499e+08 -3.27756597e+08  2.48e+03 2.10e+03  1.49e+05     0s
   5   7.00973070e+07 -1.00143084e+08  5.95e+02 5.05e+02  4.68e+04     0s
   6   5.13896376e+07 -5.99732943e+07  3.70e+02 3.15e+02  3.05e+04     0s
   7   3.96983682e+07 -3.16841059e+07  2.30e+02 1.96e+02  2.11e+04     0s
   8   3.10010761e+07  1.28866269e+07  1.97e+01 1.68e+01  5.06e+03     0s
   9   2.81235382e+07  2.13342233e+07  1.97e-05 1.68e-05  1.66e+03     0s
  10   2.61032064e+07  2.44705480e+07  3.01e-06 2.56e-06  3.99e+02     0s
  11   2.55169254e+07  2.51226131e+07  1.20e-10 9.59e-11  9.63e+01     1s
  12   2.53592819e+07  2.52785079e+07  5.46e-11 1.30e-10  1.97e+01     1s
  13   2.53242850e+07  2.53075656e+07  1.49e-10 1.58e-10  4.08e+00     1s
  14   2.53158555e+07  2.53137408e+07  9.82e-11 1.42e-10  5.16e-01     1s
  15   2.53147316e+07  2.53145119e+07  5.09e-11 1.71e-10  5.36e-02     1s
  16   2.53146065e+07  2.53145989e+07  8.37e-11 1.22e-10  1.85e-03     1s
  17   2.53146027e+07  2.53146023e+07  5.82e-11 1.27e-10  1.01e-04     1s
  18   2.53146025e+07  2.53146025e+07  1.35e-10 1.18e-10  8.73e-06     1s
  19   2.53146025e+07  2.53146025e+07  1.06e-10 1.17e-10  6.36e-07     1s
  20   2.53146025e+07  2.53146025e+07  1.60e-10 1.75e-10  2.05e-08     1s
  21   2.53146025e+07  2.53146025e+07  1.31e-10 8.73e-11  9.50e-10     1s

Barrier solved model in 21 iterations and 0.72 seconds (0.69 work units)
Optimal objective 2.53146025e+07


User-callback calls 213, time in user-callback 0.00 sec
  2.279497 seconds (2.88 M allocations: 488.944 MiB, 14.80% gc time, 2.55% compilation time)

You might want to take a look at GoldfarbIdnaniSolver, a package which is closer to the R quadprog package; it has some benchmarks against other solvers too.


Aside from everything else, that’s really not such a large condition number.


One of the benefits of formulating your problem with fixed \alpha using JuMP is that you would be able to differentiate through the optimization problem itself with DiffOpt.jl, and thus perform hyperparameter tuning on \alpha in a more principled way.
