Solve the System of Equations (J' J + μI) x = J'v

Norman · July 9, 2020, 3:01am

I have a square sparse matrix J of size n * n where n can be around 1 million.
μ is a scalar. I is the identity matrix. v is a vector.
The system to be solved is

(J'J + μI) x = J'v

What I have now is the following:

H = J'*J + spdiagm(0=>fill(μ, n))
B = J'*v
x = H\B

J is of type SparseMatrixCSC and n can be pretty large.

Is there some better way to solve this system?

Here is a coarse comparison between Julia and Matlab on my computer.
I am not sure whether this comparison is fair. But, I am very curious on how I could improve the Julia code to make things faster.

Note that although I am using a tridiagonal matrix here, the matrix J may take other form in actual usage. It is a Jacobian matrix for a Fischer-Burmeister function.

Julia:

using LinearAlgebra, SparseArrays

function main()
    n = 1000000
    J = spdiagm(0=>[9;fill(17,n-2);9], -1=>fill(-8,n-1), 1=>fill(-8,n-1))
    v = fill(0.05,n)

    H = J'*J + spdiagm(0=>fill(0.05, n))
    B = J'*v
    @time x = H\B
    return x
end

main()
% 0.394470 seconds (48 allocations: 534.059 MiB, 1.99% gc time)

Matlab:

n = 1000000;
J = spdiags([-8*ones(n,1) [9;17*ones(n-2,1);9] -8*ones(n,1)], [-1 0 1], n, n);
v = 0.05*ones(n,1);

H = J'*J + spdiags(0.05*ones(n,1), 0, n, n);
B = J'*v;
tic;
x = H\B;
toc;
% Elapsed time is 0.101177 seconds.

Thanks for your time!

The suggestion from @ettersi really makes a difference.

using LinearAlgebra, SparseArrays
using IterativeSolvers

function main2()
    n = 1000000
    J = spdiagm(0=>[9;fill(17,n-2);9], -1=>fill(-8,n-1), 1=>fill(-8,n-1))
    v = fill(0.05,n)
    @time x = lsmr(J, v, λ=sqrt(0.05))
    return x
end

main2()
% 0.033991 seconds (34 allocations: 61.037 MiB)

lucas711642 · July 9, 2020, 3:45am

If μ is positive, then you can use the shift keyword of cholesky, replacing

(J' * J + spdiagm(0 => fill(μ, n))) \ B

by

cholesky(J' * J; shift = μ) \ B

Note that you don’t need to construct the diagonal matrix in your code, and you could have used H = J' * J + μ * I instead.

ettersi · July 9, 2020, 4:01am

Your linear system are the normal equations of the least-squares problem

\min_x \|\hat J x - \hat v\|_2

where

\hat J = \begin{pmatrix} J \\ \sqrt{\mu} \, I_n \end{pmatrix} ,\qquad \hat v = \begin{pmatrix} v \\ 0_n \end{pmatrix}

with I_n, 0_n the n \times n identity and zero matrices, respectively. It is usually recommended to tackle the original least squares problem rather than the normal equations since the normal equations are much more ill-conditioned.

Since your least squares is very large and sparse, I recommend you try solving it using iterative methods like LSMR or LSQR which are implemented in IterativeSolvers.jl.

Joris_Pinkse · July 9, 2020, 4:04am

Julia compiles things before it runs. This means that if you only run some code once, then it can take longer. So only time on the second run (add another main call) to make the comparison fair. Julia speed is moreover sensitive to the way you programs things. I don’t know enough about the sparse array functions to comment. When I run your code, Julia appears to only be using one thread, whereas Matlab may be using more. There is a limit on the number of threads that OpenBLAS uses (8?), which is probably not a constraint here, but which you can change if you compile from source. There are other libraries that you could use like MKL. Finally, look into PackageCompiler: it’s my best friend.

jlperla · July 9, 2020, 4:50am

yes. To reiterate, those algorithms implement the L2 regularized least squares directly, which I believe is the exact form of your problem.

mohamed82008 · July 9, 2020, 9:31am

I don’t think this reformulation is correct. The normal equations for the least squares problem you wrote are (J' J + \mu \textbf{1}) x = J' v. Notice the \mu \textbf{1} not the original \mu I.

Edit: the equations actually describe the optimality conditions of minimizing || J x - v ||_2^2 + \mu ||x||_2^2.
Edit 2: if \mu is positive then J' J + \mu I is positive definite, so you can use IterativeSolvers.cg. You may not need to form J' J + \mu I at all, just use the matrix-free API from https://juliamath.github.io/IterativeSolvers.jl/dev/getting_started/#matrixfree-1.

ettersi · July 10, 2020, 2:36pm

Correct, my original formulation was wrong. I’ve updated my answer.
@Norman’s solution is correct anyway since he used the \lambda parameter of lsmr rather than the augmented linear system.

Applying CG to the normal equations does not solve the ill-conditioning problem since the matvec J^TJ \, v wipes out information in v regardless of how you evaluate the product. This is the magic of lsqr, it implicitly runs CG on J^TJ + \mu I, but it does it in a clever way to avoid this loss of information.

Topic		Replies	Views
Julia can't solve sparse linear system, MATLAB can General Usage matlab , linearalgebra , sparse	5	868	May 26, 2022
Sparse matrix-vector product: much more slow than Matlab Performance matlab , optimization	24	4517	December 20, 2017
How to solve this Ax=b faster? Numerics linearalgebra , linearsolve	24	2450	June 7, 2022
Julia is slower than matlab when it comes to matrix diagonalization? General Usage question , eigs , eigenvalue-problem	39	1627	January 14, 2024
Adding to sparse matrix efficiently and IterativeSolvers.gmres speed vs MATLAB General Usage	3	1382	March 1, 2019

Solve the System of Equations (J' J + μI) x = J'v

Related topics