When to use SVD and when to use Eigendecomposition for PCA?

In exact arithmetic (no rounding errors), the SVD of A is equivalent to an eigendecomposition of AᵀA: the eigenvalues of AᵀA are the squared singular values σᵢ², and its eigenvectors are the right singular vectors of A. However, forming the “covariance” matrix AᵀA squares the condition number, i.e. it doubles the number of digits that you lose to roundoff errors. Of course, if you start with about 15 digits and your matrix has a condition number of 10², so that you are only losing about 2 digits, then squaring the condition number to 10⁴ still leaves you with 10+ digits. But if your matrix is badly conditioned then you could have a big problem, especially if your data are inaccurate to begin with. That’s why SVD routines typically work by bidiagonalization or similar methods that avoid forming AᵀA explicitly.
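To make the digit-loss argument concrete, here is a minimal sketch (my own construction, not from the original discussion): a matrix with singular values 1 and 10⁻⁸, so that AᵀA has condition number ≈ 10¹⁶, right at the precision limit of Float64 (eps ≈ 2.2 × 10⁻¹⁶):

```julia
using LinearAlgebra

# Build a 100×2 matrix with known singular values 1 and 1e-8,
# so cond(A) ≈ 1e8 but cond(AᵀA) ≈ 1e16.
U = Matrix(qr(randn(100, 2)).Q)   # 100×2 with orthonormal columns
V = Matrix(qr(randn(2, 2)).Q)     # 2×2 orthogonal
A = U * Diagonal([1.0, 1e-8]) * V'

σ_svd = svdvals(A)                       # bidiagonalization-based SVD
λ = reverse(eigvals(Symmetric(A' * A)))  # eigenvalues of AᵀA, descending
σ_eig = sqrt.(max.(λ, 0.0))              # clamp tiny negative roundoff

@show σ_svd   # ≈ [1.0, 1.0e-8]: both values accurate
@show σ_eig   # σ₂ is wrong: λ₂ ≈ 1e-16 is drowned in the roundoff of AᵀA
```

The SVD recovers both singular values to nearly full precision, while the eigendecomposition route returns essentially noise for σ₂, since its squared value sits below the roundoff floor of AᵀA.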

However, for PCA my understanding is that there is a saving grace, because in PCA you are usually only interested in the few largest singular values. In particular (covered in e.g. lecture 31 of Trefethen & Bau, *Numerical Linear Algebra*), computing the k-th singular value σₖ via AᵀA amplifies the error (compared to a bidiagonalization-based SVD) by a factor of ≈ σ₁/σₖ, where σ₁ is the largest singular value. Consequently, as long as you only need singular values and vectors whose magnitudes are comparable to σ₁, as is typical in PCA, working with the covariance matrix is not so bad.
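As an illustration, here is a hypothetical sketch (the function names, sizes, and spectrum are my own assumptions) comparing the top principal components computed via a full SVD of the centered data against those from an eigendecomposition of the covariance matrix:

```julia
using LinearAlgebra, Statistics

# Top-k principal directions and explained variances via SVD of centered data.
function pca_svd(X, k)
    Xc = X .- mean(X, dims=1)
    F = svd(Xc)
    return F.V[:, 1:k], (F.S[1:k] .^ 2) ./ (size(X, 1) - 1)
end

# Same quantities via eigendecomposition of the covariance matrix XᵀX/(n-1).
function pca_cov(X, k)
    Xc = X .- mean(X, dims=1)
    E = eigen(Symmetric(Xc' * Xc ./ (size(X, 1) - 1)))  # ascending eigenvalues
    idx = reverse(axes(E.values, 1))[1:k]               # indices of the k largest
    return E.vectors[:, idx], E.values[idx]
end

# Data whose singular values decay over ~3 orders of magnitude.
X = randn(500, 20) * Diagonal(exp10.(range(0, -3, length=20)))

Vs, λs = pca_svd(X, 3)
Vc, λc = pca_cov(X, 3)
@show maximum(abs.(λs .- λc) ./ λs)   # tiny relative differences
@show opnorm(abs.(Vs' * Vc) - I)      # directions agree up to sign
```

For the leading components the two routes agree to near machine precision; the σ₁/σₖ error amplification would only become visible for trailing components with σₖ ≪ σ₁.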

See also the discussion in [JuliaLang/julia#21701](https://github.com/JuliaLang/julia/pull/21701) (“Use A'A instead of [0 A;A' 0] in svds.”) regarding the svds function, which is typically used to compute only a few of the largest singular values (as in PCA), so the AᵀA method is attractive for computational efficiency despite the potential accuracy loss.
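For completeness, a usage sketch assuming the Arpack.jl package (where svds lives in current Julia versions):

```julia
using Arpack, LinearAlgebra

A = randn(2000, 300)
S, = svds(A; nsv=5)    # iterative: first element of the returned tuple is an SVD
@show S.S              # the 5 largest singular values
@show svdvals(A)[1:5]  # dense SVD for comparison; should agree closely
```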
