Solve the System of Equations (J' J + μI) x = J'v

I have a square sparse matrix J of size n * n where n can be around 1 million.
μ is a scalar. I is the identity matrix. v is a vector.
The system to be solved is

(J'J + μI) x = J'v

What I have now is the following:

H = J'*J + spdiagm(0=>fill(μ, n))
B = J'*v
x = H\B

J is of type SparseMatrixCSC and n can be pretty large.

Is there some better way to solve this system?

Here is a coarse comparison between Julia and Matlab on my computer.
I am not sure whether this comparison is fair. But, I am very curious on how I could improve the Julia code to make things faster.

Note that although I am using a tridiagonal matrix here, the matrix J may take other form in actual usage. It is a Jacobian matrix for a Fischer-Burmeister function.


using LinearAlgebra, SparseArrays

function main()
    n = 1000000
    J = spdiagm(0=>[9;fill(17,n-2);9], -1=>fill(-8,n-1), 1=>fill(-8,n-1))
    v = fill(0.05,n)

    H = J'*J + spdiagm(0=>fill(0.05, n))
    B = J'*v
    @time x = H\B
    return x

% 0.394470 seconds (48 allocations: 534.059 MiB, 1.99% gc time)


n = 1000000;
J = spdiags([-8*ones(n,1) [9;17*ones(n-2,1);9] -8*ones(n,1)], [-1 0 1], n, n);
v = 0.05*ones(n,1);

H = J'*J + spdiags(0.05*ones(n,1), 0, n, n);
B = J'*v;
x = H\B;
% Elapsed time is 0.101177 seconds.

Thanks for your time!

The suggestion from @ettersi really makes a difference.

using LinearAlgebra, SparseArrays
using IterativeSolvers

function main2()
    n = 1000000
    J = spdiagm(0=>[9;fill(17,n-2);9], -1=>fill(-8,n-1), 1=>fill(-8,n-1))
    v = fill(0.05,n)
    @time x = lsmr(J, v, λ=sqrt(0.05))
    return x

% 0.033991 seconds (34 allocations: 61.037 MiB)

If μ is positive, then you can use the shift keyword of cholesky, replacing

(J' * J + spdiagm(0 => fill(μ, n))) \ B


cholesky(J' * J; shift = μ) \ B

Note that you don’t need to construct the diagonal matrix in your code, and you could have used H = J' * J + μ * I instead.


Your linear system are the normal equations of the least-squares problem

\min_x \|\hat J x - \hat v\|_2


\hat J = \begin{pmatrix} J \\ \sqrt{\mu} \, I_n \end{pmatrix} ,\qquad \hat v = \begin{pmatrix} v \\ 0_n \end{pmatrix}

with I_n, 0_n the n \times n identity and zero matrices, respectively. It is usually recommended to tackle the original least squares problem rather than the normal equations since the normal equations are much more ill-conditioned.

Since your least squares is very large and sparse, I recommend you try solving it using iterative methods like LSMR or LSQR which are implemented in IterativeSolvers.jl.


Julia compiles things before it runs. This means that if you only run some code once, then it can take longer. So only time on the second run (add another main call) to make the comparison fair. Julia speed is moreover sensitive to the way you programs things. I don’t know enough about the sparse array functions to comment. When I run your code, Julia appears to only be using one thread, whereas Matlab may be using more. There is a limit on the number of threads that OpenBLAS uses (8?), which is probably not a constraint here, but which you can change if you compile from source. There are other libraries that you could use like MKL. Finally, look into PackageCompiler: it’s my best friend.

1 Like

yes. To reiterate, those algorithms implement the L2 regularized least squares directly, which I believe is the exact form of your problem.

I don’t think this reformulation is correct. The normal equations for the least squares problem you wrote are (J' J + \mu \textbf{1}) x = J' v. Notice the \mu \textbf{1} not the original \mu I.

Edit: the equations actually describe the optimality conditions of minimizing || J x - v ||_2^2 + \mu ||x||_2^2.
Edit 2: if \mu is positive then J' J + \mu I is positive definite, so you can use You may not need to form J' J + \mu I at all, just use the matrix-free API from


Correct, my original formulation was wrong. I’ve updated my answer.
@Norman’s solution is correct anyway since he used the \lambda parameter of lsmr rather than the augmented linear system.

Applying CG to the normal equations does not solve the ill-conditioning problem since the matvec J^TJ \, v wipes out information in v regardless of how you evaluate the product. This is the magic of lsqr, it implicitly runs CG on J^TJ + \mu I, but it does it in a clever way to avoid this loss of information.