What is the fastest way to find if a real nonsquare matrix is injective?

Hello guys!

The title says all. I have a certain M\times N real valued matrix A and I would like to know if it is injective. What would be the fastest way to accomplish that? I know that I could calculate the determinant of A^\dagger A and check if its zero or not, but I have a feeling that this method is probably very inefficient. If it makes a difference, in my use case, one usually has M \gg N.


If the matrix is not too large, the SVD may be fast enough. A zero singular value tells you the matrix is rank-deficient. A small singular value (ie near roundoff) tells you that there’s trouble. Checking for a zero singular value is not likely to work thanks to rounding errors.

A rank revealing QR would be faster, but I don’t know if there is one of these in Julia.

This is quite efficient (probably much faster than SVD or QR), but is terrible for other reasons:

  1. The determinant will rarely be exactly zero because roundoff errors make it difficult to distinguish singular from near-singular matrices.
  2. In consequence of (1), for a near-singular matrix, you want to check that the determinant is small rather than zero, but small compared to what? (See below.)
  3. Determinants of large matrices can easily overflow the largest representable floating-point value (though you can use logabsdet to avoid this).
  4. Computing A^\dagger A = \overline{A^T} A (also denoted A^* A) squares the condition number of A and exacerbates roundoff errors. SVD and QR methods (below) avoid computing this matrix explicitly.

Right, but be careful — you should check for small singular values compared to the largest singular value. (There is no such thing as “small” in an absolute sense — it’s always important to ask “small compared to what?”)

Of course, comparing the smallest singular value to the largest singular value is the same as computing the condition number of the matrix, which you can do with cond. If cond(A) is \gg 1 then the matrix is close to rank-deficient.

(Alternatively, you can call the rank function, which counts the number of singular values that are not too small compared to the largest one, but if you just want to check that it is full rank I would simply look at the condition number.)

(Both cond and rank use the SVD.)

Yes, Julia has pivoted QR, and you can use this to estimate the rank, though in principle the SVD-based methods are more reliable. See How to find the linearly independent columns (rows) of a matrix - #14 by stevengj and rank(::QRPivoted) method? · Issue #53214 · JuliaLang/julia · GitHub.

You could also use QR to compute the condition number, since if A = QR then \mathrm{cond}(A) = \mathrm{cond}(R) in exact arithmetic, and cond(qr(A, ColumnNorm()).R) should be faster than cond(A) (via the SVD) for a very “tall” matrix.


For a very tall matrix (m \times n where m \gg n), an even faster thing should be to compute the condition number via:


since that reduces it to a small n \times n matrix (by a fast matrix multiplication) before taking the SVD (and the SVD for a Hermitian matrix is even faster, just a Hermitian eigenproblem).

Computing A'A loses you some accuracy in the ill-conditioned case, but that probably doesn’t matter if you are only using this to check whether the condition number is large.


It would also be helpful if you provided some context here. Why do you need to calculate this quickly? What do you plan to do with the answer to this question?

The fact that injective/non-injective isn’t really a binary choice given the presence of uncertainty (due to floating-point roundoff or other sources of error) makes me wonder whether you are asking the wrong question, i.e. whether you should re-think your higher-level algorithm.


I’m interested in a problem of quantum state tomography. In this context, the probability of the outcomes of an experiment are the expectation values of a set of operators called a positive operator valued measure (POVM). To be suitable for tomography, I want that every possible set of probabilities, which are determined by the considered POVM, correspond to a unique density matrix, so that, with the measured results, I can distinguish between different quantum states. A POVM with this property is said informationally complete. This is equivalent to ensuring that a certain operator A is injective.

What I actually want to do is to decide if a given POVM is informationally complete or not. Now that I think about it, I actually don’t need it to check it super fast. To check it accurately would be much more important.

Now that I read more about the condition number @stevengj mentioned, I think that it probably is the right concept to apply in this case. If I understand it correctly, in practice, if the condition number is high, any experimental errors in the determination of the probabilities will cause a huge uncertainty in the reconstruction of the state. In the case where my matrix A is not injective, the condition number tends to infinity.

1 Like

Your resoning seems corect to me, so I’m not understanding what is happening in the example bellow:

julia> A = [1 0.0 0.0 1; 1 0.0 0.0 -1]
2×4 Matrix{Float64}:
 1.0  0.0  0.0   1.0
 1.0  0.0  0.0  -1.0

julia> cond(A)

julia> cond(A' * A)

One should have cond(A) = Inf, right?

No. Because this is a “wide” matrix not a “tall” matrix, the condition number measures whether it has independent rows (it is surjective) not independent columns. i.e.

cond(A) ≈ sqrt(cond(A*A'))

in this case. The condition number is 1 because you have orthogonal rows, which is as independent as they can get.

In your original post you said you are dealing with “tall” matrices (more rows than columns), in which case the condition number measures injectivity and cond(A) ≈ sqrt(cond(A'A)).

(If you have a wide matrix, there is no need to do any computation to determine injectivity — a wide matrix is never injective, and corresponds to undetermined problems with non-unique solutions.)

More precisely, for an m \times n matrix, the SVD finds k = \min(m,n) singular values \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k \ge 0, and the condition number is \sigma_1 / \sigma_k. That’s why the interpretation depends on whether m > n vs m < n.