Solving a Least Squares Problem with a Quadratic Regularization Term When the Matrix Does Not Fit in Memory

I want to solve:

\arg \min_{\boldsymbol{x}} \frac{\lambda}{2} \boldsymbol{x}^{T} \boldsymbol{E}^{T} \boldsymbol{K} \boldsymbol{E} \boldsymbol{x} + \frac{\rho}{2} {\left\| \boldsymbol{A} \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2}

where \rho, \lambda > 0, \boldsymbol{E} = \left[ \boldsymbol{I}_{n}, \; \boldsymbol{0}_{n} \right] (so \boldsymbol{E} \boldsymbol{x} extracts the first n entries of \boldsymbol{x} \in \mathbb{R}^{n + 1}) and \boldsymbol{A} = \operatorname{Diag} \left( \boldsymbol{y} \right) \left[ \boldsymbol{K}, \; \boldsymbol{1} \right].
The matrix \boldsymbol{K} is a kernel matrix.
It is a symmetric positive semidefinite (SPSD) matrix, yet I can't hold it in memory. I can only compute individual elements of it, a row / column, or apply it to a vector / matrix.
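The operator form of \boldsymbol{K} is enough for what follows. As a concrete sketch (assuming, purely for illustration, an RBF kernel and a block size of 256), \boldsymbol{K} \boldsymbol{v} can be accumulated one block of rows at a time without ever materializing \boldsymbol{K}:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

def rbf_rows(X, i0, i1, gamma):
    """Rows i0:i1 of K, with K[i, j] = exp(-gamma * ||X[i] - X[j]||^2)."""
    d2 = ((X[i0:i1, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def kernel_operator(X, gamma=1.0, block=256):
    """Matrix-free operator applying K to a vector, one block of rows at a time."""
    n = X.shape[0]
    def matvec(v):
        out = np.empty(n)
        for i0 in range(0, n, block):
            i1 = min(i0 + block, n)
            out[i0:i1] = rbf_rows(X, i0, i1, gamma) @ v
        return out
    # K is symmetric, so K^T v = K v
    return LinearOperator((n, n), matvec=matvec, rmatvec=matvec, dtype=float)
```

Memory then stays at O(block \cdot n) per product instead of O(n^2).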

Let \tilde{\boldsymbol{x}} = \boldsymbol{E} \boldsymbol{x} = \left[ {x}_{1}, {x}_{2}, \ldots, {x}_{n} \right]^{T}, so {x}_{n + 1} plays the role of a bias term. Then \boldsymbol{A} \boldsymbol{x} = \begin{bmatrix} {y}_{1} \left( \boldsymbol{k}_{1}^{T} \tilde{\boldsymbol{x}} + {x}_{n + 1} \right) \\ {y}_{2} \left( \boldsymbol{k}_{2}^{T} \tilde{\boldsymbol{x}} + {x}_{n + 1} \right) \\ \vdots \end{bmatrix} where \boldsymbol{k}_{i} is the i-th row / column of \boldsymbol{K}.
This implies \boldsymbol{A} \boldsymbol{x} - \boldsymbol{y} = \begin{bmatrix} {y}_{1} \left( \boldsymbol{k}_{1}^{T} \tilde{\boldsymbol{x}} + {x}_{n + 1} - 1 \right) \\ {y}_{2} \left( \boldsymbol{k}_{2}^{T} \tilde{\boldsymbol{x}} + {x}_{n + 1} - 1 \right) \\ \vdots \end{bmatrix}, hence {\left\| \boldsymbol{A} \boldsymbol{x} - \boldsymbol{y} \right\|}_{2}^{2} = \sum_{i} {y}_{i}^{2} {\left( \boldsymbol{k}_{i}^{T} \tilde{\boldsymbol{x}} + {x}_{n + 1} - 1 \right)}^{2}.
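Both \boldsymbol{A} and \boldsymbol{A}^{T} inherit this structure: applying either one costs a single product with \boldsymbol{K}, since \boldsymbol{A}^{T} \boldsymbol{r} = \left[ \boldsymbol{K} \left( \boldsymbol{y} \odot \boldsymbol{r} \right); \; \boldsymbol{1}^{T} \left( \boldsymbol{y} \odot \boldsymbol{r} \right) \right]. A minimal sketch (K_mv stands for any routine that applies the symmetric \boldsymbol{K} to a vector; the function names are mine):

```python
import numpy as np

def make_A_ops(K_mv, y):
    """Matrix-free application of A = Diag(y) [K, 1] and its transpose.
    K_mv(v) applies the symmetric kernel matrix K to a vector v; an input
    x is split into its first n entries (hit by K) and a last bias entry."""
    n = y.shape[0]

    def A_mv(x):                  # A x = y ⊙ (K x[:n] + x[n] 1)
        return y * (K_mv(x[:n]) + x[n])

    def At_mv(r):                 # A^T r = [K (y ⊙ r); 1^T (y ⊙ r)]
        yr = y * r
        return np.concatenate([K_mv(yr), [yr.sum()]])

    return A_mv, At_mv
```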

I wonder if there is a relatively efficient way to solve the problem, given the special structure of the matrices, without forming \boldsymbol{K} explicitly.

Conjugate gradient or other Krylov least-squares methods such as LSQR or CGLS seem like they could be adapted to this, since these iterative methods only require matrix–vector products.
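For instance, setting the gradient of the objective to zero gives the normal equations \left( \lambda \boldsymbol{E}^{T} \boldsymbol{K} \boldsymbol{E} + \rho \boldsymbol{A}^{T} \boldsymbol{A} \right) \boldsymbol{x} = \rho \boldsymbol{A}^{T} \boldsymbol{y}, and each application of the left-hand operator costs three products with \boldsymbol{K}, so plain CG applies directly. A self-contained sketch with a hand-rolled CG loop (all helper names are mine, not from any library):

```python
import numpy as np

def solve_matrix_free(K_mv, y, lam, rho, tol=1e-10, maxiter=None):
    """Minimize (lam/2) x^T E^T K E x + (rho/2) ||A x - y||_2^2 by running
    plain CG on the normal equations (lam E^T K E + rho A^T A) x = rho A^T y.
    K_mv(v) applies the symmetric kernel matrix K; K itself is never formed."""
    n = y.shape[0]

    def A_mv(x):                       # A x = y ⊙ (K x[:n] + x[n] 1)
        return y * (K_mv(x[:n]) + x[n])

    def At_mv(r):                      # A^T r = [K (y ⊙ r); 1^T (y ⊙ r)]
        yr = y * r
        return np.concatenate([K_mv(yr), [yr.sum()]])

    def H_mv(x):                       # (lam E^T K E + rho A^T A) x
        reg = np.concatenate([K_mv(x[:n]), [0.0]])   # E^T K E x
        return lam * reg + rho * At_mv(A_mv(x))

    b = rho * At_mv(y)
    x = np.zeros(n + 1)
    r = b - H_mv(x)
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter or 5 * (n + 1)):
        Hp = H_mv(p)
        alpha = rs / (p @ Hp)
        x = x + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

Since CG only minimizes over the Krylov subspace generated by the right-hand side, it also tolerates an SPSD operator as long as the right-hand side lies in its range.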

At first glance LSQR looks promising, as it can solve the problem:

\arg \min_{\boldsymbol{x}} {\left\| \boldsymbol{A} \boldsymbol{x} - \boldsymbol{b} \right\|}_{2}^{2} + {\lambda}^{2} {\left\| \boldsymbol{x} \right\|}_{\boldsymbol{F}}^{2}

where {\left\| \boldsymbol{x} \right\|}_{\boldsymbol{F}}^{2} = \boldsymbol{x}^{T} \boldsymbol{F} \boldsymbol{x}, which fits my case with \boldsymbol{F} = \boldsymbol{E}^{T} \boldsymbol{K} \boldsymbol{E}.

Yet the issue is that the algorithm requires applying \boldsymbol{N} = \boldsymbol{F}^{-1}, which is not feasible in my case.

Unless I am missing something.

Another issue is that the matrix is required to be SPD, while \boldsymbol{E}^{T} \boldsymbol{K} \boldsymbol{E} is only SPSD: its last row and column are zero, so it is singular even when \boldsymbol{K} is SPD.

Update

It seems I can control this with ldiv, as described in Krylov.jl - Preconditioners:
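For what it's worth, the same hook exists in SciPy: cg accepts an M argument that should approximate the inverse of the system operator. A minimal sketch with a Jacobi (diagonal) preconditioner on a stand-in SPD matrix (the matrix here is made up purely for illustration):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n = 50
G = rng.normal(size=(n, n))
H = G @ G.T + np.diag(np.linspace(1.0, 100.0, n))  # stand-in SPD system
b = rng.normal(size=n)

# Jacobi preconditioner: M approximates H^{-1} by inverting the diagonal
d = np.diag(H)
M = LinearOperator((n, n), matvec=lambda v: v / d, dtype=float)

x, info = cg(H, b, M=M)  # info == 0 signals convergence
```

With a strongly varying diagonal, the Jacobi preconditioner typically cuts the CG iteration count noticeably at negligible cost per iteration.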