SVD gives different results from Python with MWE

I think the basic problem here is that this W is an ill-conditioned function of A, so it will be extremely sensitive to small differences in roundoff errors (in any language).

It corresponds to taking A^T = V \Sigma U^T and then forming a new matrix W = A^T U \Sigma^{-1} U^T = VU^T. This is a problem because, whenever A has small singular values, the matrix U \Sigma^{-1} U^T that you are (implicitly) multiplying by is ill-conditioned. (In your example, the 300\times 300 matrix A has numerical rank \approx 133: its near-zero singular values are basically numerical noise that you are amplifying to unity.)
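Here is a minimal sketch of that sensitivity, using a synthetic rank-deficient matrix in place of your A = Z^T X (so the numbers are only illustrative): perturbing A at roughly the roundoff level changes W = VU^T by an O(1) amount.

import numpy as np

rng = np.random.default_rng(0)

# synthetic 300x300 matrix of rank 133, standing in for A = Z^T X
A = rng.standard_normal((300, 133)) @ rng.standard_normal((133, 300))

def orthogonal_factor(A):
    U, s, Vt = np.linalg.svd(A)
    return Vt.T @ U.T   # W = V U^T

W1 = orthogonal_factor(A)
W2 = orthogonal_factor(A + 1e-10 * rng.standard_normal(A.shape))  # tiny perturbation

np.linalg.norm(W1 - W2, 2)   # O(1): W changes drastically even though A barely changed

The perturbation only matters in the (numerical) null space of A, but that is exactly the part that \Sigma^{-1} blows up from roundoff level to unity.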

A symptom of this is that, even in Python, you get a completely different W matrix if you do the computation in double vs. single precision:

import numpy as np

# embeddings as stored on disk (single precision)
X = np.load('src_emb.npy')
Z = np.load('trg_emb.npy')

# W = V U^T from the SVD of Z^T X, in single precision
u, s, vt = np.linalg.svd(Z.T.dot(X))
w = vt.T.dot(u.T)

# the same computation after promoting the data to double precision
u64, s64, vt64 = np.linalg.svd(np.float64(Z).T.dot(np.float64(X)))
w64 = vt64.T.dot(u64.T)

# operator (2-)norm of the difference between the two results
np.linalg.norm(w - w64, 2)

gives 1.99999... (i.e. roughly 2, which is what you would expect for the operator norm of the difference between two random unitary matrices).
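(If you want to check the rank claim above yourself, you can count the singular values of Z^T X that lie above a roundoff-level threshold; the tolerance below is just one common convention, not the only reasonable choice:)

# s64 from the snippet above: singular values of Z^T X in double precision
tol = max(Z.shape[1], X.shape[1]) * np.finfo(np.float32).eps * s64[0]
numerical_rank = np.sum(s64 > tol)   # about 133 for this data
condition = s64[0] / s64[-1]         # huge, since the smallest singular values are noise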

You may need to re-think the computation you are performing if it depends on this matrix. (In some cases the final result may be okay if the algorithm is sufficiently clever/careful, even if intermediate matrices like W vary wildly.)
