Result changes depending on whether matrix is sparse or not

I get this weird bug (I assume) but I cannot reproduce it with another matrix. So this is the setup:

A = sparse([1,2,1,3,3,4,5], [3,3,4,4,5,6,6,], [1,1,1,1,1,1,1], 6, 6 )

A = convert(SparseMatrixCSC{Float64,Int64},A)

d = vec(sum(A,dims=1))

M = spdiagm(0 => d)-transpose(A)

dropzeros!(M)

After this this happens:

julia> qr(M)\d
6-element Array{Float64,1}:
 -2.0
 -0.0
  0.0
  0.0
  1.0
  1.5

julia> qr(Matrix(M),Val(true))\d
6-element Array{Float64,1}:
 -1.4178403755868538
 -0.9389671361502343
 -0.17840375586854448
  0.20187793427230036
  0.8215962441314553
  1.5117370892018778

The second one is the correct result. Is this a bug? Am I doing something wrong here?

The matrix is singular. Using Ax=b terminology, the right-hand-side vector b is in the range of A, so solutions to Ax=b exist, but they are not unique. Different algorithms will produce different answers x, each of which satisfies Ax=b

julia> x2 = qr(Matrix(M),Val(true))\d
6-element Array{Float64,1}:
 -1.4178403755868538 
 -0.9389671361502343 
 -0.17840375586854448
  0.20187793427230036
  0.8215962441314553 
  1.5117370892018778 

julia> x1 = qr(M)\d
6-element Array{Float64,1}:
 -2.0
 -0.0
  0.0
  0.0
  1.0
  1.5

julia> M*x1
6-element Array{Float64,1}:
 0.0
 0.0
 2.0
 2.0
 1.0
 2.0

julia> M*x2
6-element Array{Float64,1}:
 0.0               
 0.0               
 1.9999999999999991
 1.9999999999999991
 0.9999999999999998
 2.0               

The first two rows of M are zero

6×6 Array{Float64,2}:
  0.0   0.0   0.0   0.0   0.0  0.0
  0.0   0.0   0.0   0.0   0.0  0.0
 -1.0  -1.0   2.0   0.0   0.0  0.0
 -1.0   0.0  -1.0   2.0   0.0  0.0
  0.0   0.0  -1.0   0.0   1.0  0.0
  0.0   0.0   0.0  -1.0  -1.0  2.0

so M has at best rank 4. The SVD shows that this is the case: two singular values are zero (to floating-point precision).

julia> (U,sigma,V) = svd(Matrix(M));

julia> sigma
6-element Array{Float64,1}:
 2.9775294768839364   
 2.5083437595312112   
 1.9623854744780629   
 0.9957776096426464   
 1.506335424828049e-16
 8.938809282980296e-18
1 Like

Yes, the matrix is singular but the result of qr(M)\d should be the pseudo-inverse of M applied on d. The pseudo-inverse is unique and the result should be unique.

I wouldn’t make any stronger assumption than that QR backslash provides a minimizer of ||Ax-b||, and a non-unique minimizer for singular A. If you want the pseudo inverse, use pinv.

julia> pinv(Matrix(M))*d
6-element Array{Float64,1}:
 -1.4178403755868545 
 -0.9389671361502349 
 -0.17840375586854493
  0.2018779342723    
  0.8215962441314556 
  1.511737089201878  

Hmmm, my suggestion to use pinv is not a practical one. It’s never a good idea to compute an inverse to solve an Ax=b problem.

Is there any special reason in your application why one solution for x is better than another?

Regarding pseudo-inverse and QR, this thread is relevant.

1 Like

Why do you think it is not a good idea?
Computationally wise it might be better use direct methods to find the solution.
But accuracy wise and certainly if one needs the Least Norm solution of the Least Squares there is nothing wrong about it.

Could it be that the Myth about “Don’t compute the inverse” is overblown?
See How Accurate is inv(A)*b?

By the way, MATLAB has the function lsqminnorm() which uses the complete orthogonal decomposition (COD) to compute the minimum norm least squares solution.
If it is available in Julia one could use it as well as it should be faster than pinv() which uses the SVD.

SuiteSparseQR, which is used when A is sparse, returns a basic least-squares solutions, which means one with many zeros (hence the zeros that you observe). LAPACK, used in the dense case, returns the minimum-norm solution.

@RoyiAvital it tends to be less about accuracy than speed. (https://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/). For ill conditioned matrices, you very well may want to be using some sort of regularized method anyway (https://en.wikipedia.org/wiki/Tikhonov_regularization)

I am not sure about this in general (only for full rank matrices, and your matrix isn’t).

I do think that the result of qr(M)\d should be the minimum norm least squares solutions. If nothing else, just for consistency.

But in any case I have the following problem:

I have a singular sparse matrix M whose dimension can be very large and a vector d that may or may not be in the range of M and I want to compute pinv(M)*d. What is the most efficient way to do that?

Great, so we agree that accuracy wise there is no advantage.
Regarding speed, well, I don’t fully agree with the analysis you linked to.

He argues one can solve the linear system taking advantage of the special properties of the matrix. Yet this can be done for calculation of the inverse as well.
Let’s take Tridiagonal Matrix for instance. You can solve directly using known methods. But you can also do the same tricks for calculating the inverse.

Solving a general N×N tridiagonal system Ax=b takes O(N) time and memory. Computing the inverse of A takes O(N²) time and memory because there are N² nonzero elements of the inverse to compute. This is better than O(N³), but still far worse than not computing the inverse at all.

For a general dense matrix, it’s true that directly computing the inverse is not terrible, maybe a factor of 2 more time than just computing the LU factorization and within a small constant factor of the accuracy. But there’s usually no compelling reason to do it, and it’s arguably a bad habit because it will lead you into trouble for sparse and structured matrices.

1 Like

I totally agree with you.
But in the case of the OP he had no choice but using the Pseudo Inverse.
Then he was warned not to do it as usually most of us are warned in their first numerical course not to invert matrices for solving Linear Equations. Usually we are hinted it has something to do with accuracy and not only speed.

I was just pointing this is something that might be exaggerated.

Oh, I just noticed your question. Yes, I need the solution with minimal norm. The matrix I use is the transpose of the Laplacian of a graph and it can be very large but also very sparse, so I’m convinced there has to be a better way of doing that.

In general, you compute the minimum-norm least-squares solution using a complete orthogonal decomposition (COD). That’s what LAPACK does, but not SuiteSparseQR. It consists in computing a second QR factorization (of R’). There’s a Matlab package named Factorize that, among other things, calls SuiteSparseQR to compute a COD: https://www.mathworks.com/matlabcentral/fileexchange/24119-don-t-let-that-inv-go-past-your-eyes-to-solve-that-system-factorize. As far as I know, that hasn’t been ported to Julia.

1 Like

As I understand it, the latest algorithms to compute minimum-norm solutions using SuiteSparseQR are found in the spqr_rank package described in this paper. In particular, section 2.7 of the paper says that the COD-based algorithm typically requires more time and memory (about a factor of 2 from fig. 6) than the newer algorithm. Both spqr_rank and the older Factorize package are BSD-licensed by Tim Davis and are part of SuiteSparse. Would be nice to port spqr_rank to Julia. (See also #20409.)

1 Like

Thanks. Wouldn’t it be awesome to have all that in pure Julia?