SVD gives different results from Python with MWE


I have a python written code, and it uses svd from numpy. And I am trying to port the same code in Julia.
The problem I am facing is that after some point, even though I read the same files for both programming languages, svd method in two languages decomposes the same matrix differently. Hence, solution changes and I cannot replicate the experiment.

I have multiple svd usages inside the code, the first two gives nearly the exactly decomposition. (Everything is in Float32) But after the third svd function call my results and the original(python) code starts to differ. The only difference from the first 2 svd functions is that the third one removes the full_matrices=False flag in Python code.

 u, s, vt = np.linalg.svd(xsim[:idx], full_matrices=False) # get the same result
 u, s, vt = np.linalg.svd(zsim[:idx], full_matrices=False) # get the same result
 u, s, vt = np.linalg.svd(z[trg_indices][src_indices])) # third call to svd; where things get messy.

On the other hand, in Julia, even thoug I changed the algorithm to QRIteration , and full to true. I am not getting the same results.

I did some research and both numpy and Julia uses LAPACK package underneath. Then what should be the correct way to get the same results ?

This is what I get when I write print(numpy.show_config())

    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

For julia could someone please help me if all set is the same as Python ?


It would be much easier to help you if you could create a MWE that is give us the specific matrices xsim[:idx], zsim[:idx], and z[trg_indices][src_indices]) such that we can reproduce the behaviour.

On a general note: Have you taken account for the fact that Julia uses column-major ordering and Python row-major ordering? That is matrices in Julia in Python might differ by a transpose operation.

1 Like

Yeah ,you are right. But those matrices are huge files nearly (one is 1.1GB and the other is 512MB). If you want I can share you the links.

Ideally, you should try to create a minimal example. For example, If you only consider the submatrix, say M[1:100,1:100], do Numpy and Julia still disagree?

Having said that, sharing a link to a large file is still better than sharing nothing :slight_smile:


Actually I am reading submatrices of these huge matrices. The first 4000 words.

Instead of posting a link to a “raw” file that then needs to get loaded and processed by a function you could make it easier for the other side and dump the final matrix, i.e. xsim[:idx] to a reasonable file format that can be read directly. For example, using NPZ.jl which is compatible with and numpy.load.

This way you can simply load the file in Julia and Python (i.e. numpy) and assure that you actually operate on precisely the same data.

(You want to abstract away your context as much as possible when creating a MWE.)

Thank didn’t know that. I will upload them.

Differ in what way? Realize that you will often get slightly different results due to roundoff errors, and there is also some arbitrariness in the choice of phase of the singular vectors (or their sign, for real matrices) that can change between versions of LAPACK/BLAS. If you are using the full (not thin) SVD, then there can also be arbitrary rotations in the nullspace portions of the basis (as well as rotations in vectors for pairs of equal singular values).


Could you please try svd with these both source and target embeddings ?

For the MWE:

using LinearAlgebra
using NPZ # for loading uploaded matrices

X = npzread("./path_to_file/src_emb.npy")
Z = npzread("./path_to_file/trg_emb.npy")

F = svd(Z' * X, full=true)
W = F.V * F.U'

W should show :

300×300 Array{Float32,2}:
 -0.136035    -0.0777119    0.0617536   0.0322536    -0.0803463   -0.0503565  …   0.095194    -0.0131295   -0.0396312    -0.00628574   0.0511491
 -0.00174201   0.0281835    0.0454273   0.0358372     0.0529457   -0.0639435     -0.0942243    0.0325997    0.000161858  -0.0514902   -0.0498046
  0.0676922    0.237949     0.0913208  -0.0106957    -0.00361676  -0.0334244     -0.0415728    0.0830503    0.0839673    -0.0716634   -0.0183218
  0.180583    -0.0958469   -0.0631438  -0.000189014   0.0190076   -0.0152651      0.0361847   -0.00151322   0.0589926     0.0360943   -0.120823
  0.0309126    0.0636542    0.0391062  -0.0415557    -0.107748     0.0503444     -0.00643703  -0.0312398    0.0658869     0.00057633  -0.0033062
  ⋮                                                                ⋮          ⋱   ⋮                                                   
 -0.00427003  -0.0422672    0.0433433   0.0792277    -0.0729017   -0.0360674      0.0137698    0.00893876   0.0135633    -0.00463451  -0.0784995
  0.0467229    0.0159696   -0.0542639  -0.0673708     0.0118304    0.0391193      0.121941    -0.041253    -0.0230913    -0.0481352   -0.0489984
  0.0685667    0.00899705  -0.0338196  -0.102029     -0.072511     0.074575       0.0204513    0.0675888    0.0863168    -0.00807602   0.00272199
  0.0906389    0.08509     -0.0440936   0.0993734     0.0830268    0.0200159      0.0228065    0.067321     0.0181914    -0.0103052   -0.0205173

Whereas python:

import numpy as np 

X = np.load('./path_to_/src_emb.npy')
Z = np.load('./path_to_/trg_emb.npy')

u,s,vt = np.linalg.svd(

w =

w should show

array([[ 0.02077192, -0.09222455,  0.16809827, ..., -0.07213789,
         0.01867277, -0.03105616],
       [-0.0070491 , -0.04300215,  0.02156948, ...,  0.08834429,
        -0.01833235, -0.01745731],
       [ 0.08617274,  0.15471898,  0.01507143, ...,  0.01034247,
        -0.11459561, -0.01204915],
       [ 0.00199408,  0.01696967, -0.04052072, ...,  0.00589352,
         0.01136548, -0.05689107],
       [ 0.01693112, -0.05401053, -0.03806448, ...,  0.01929145,
        -0.03649945,  0.03389705],
       [ 0.12757824,  0.05976016, -0.03483151, ...,  0.04901506,
        -0.09714089, -0.04803994]], dtype=float32)

Yes, I understand and expect that. But after decomposing the mutliplication of both matrices, all matrices U, V start similar but they are quite dissimilar at later column space. I am not claiming anything and do not know much about how things work under the hood. I am just wondering how to make both programs give the same (similar) results.

Well, the results are certainly similar:

using NPZ, PyCall, Statistics, LinearAlgebra
np = pyimport("numpy");

adiff(A,B) = maximum(abs.(abs.(A) .- abs.(B)))

mat_src = npzread("src_emb.npy"); # Float32 data
U, S, Vt = np.linalg.svd(mat_src, full_matrices=false)
F = svd(mat_src);
display((adiff(F.U, U), adiff(F.S, S), adiff(F.Vt, Vt)))
display((abs.(F.U) ≈ abs.(U), F.S ≈ S, abs.(F.Vt) ≈ abs.(Vt)))

mat_trg = npzread("trg_emb.npy"); # Float32 data
U, S, Vt = np.linalg.svd(mat_trg, full_matrices=false)
F = svd(mat_trg);
display((adiff(F.U, U), adiff(F.S, S), adiff(F.Vt, Vt)))
display((abs.(F.U) ≈ abs.(U), F.S ≈ S, abs.(F.Vt) ≈ abs.(Vt)))

I get:

(0.0002946444f0, 9.1552734f-5, 0.0007101372f0)
(false, true, false)

(4.1430816f-5, 0.00021362305f0, 0.00010115653f0)
(true, true, true)

Thank you for your response.

In your sample, you decomposed individual matrices. And I assume you applied svd on individual matrices in order to show that the difference is so small that it does not matter if there is a multiplication operation (or another op.) between the matrices.
And as you are trying to explain the differences between decomposed values are really small which are negligible.

But when I do for the multiplication of two matrices the result is :

julia> display((adiff(F.U, U), adiff(F.S, S), adiff(F.Vt, Vt)))
(0.47538608f0, 0.0078125f0, 0.6747172f0)

julia> display((abs.(F.U) ≈ abs.(U), F.S ≈ S, abs.(F.Vt) ≈ abs.(Vt)))
(false, true, false)

For the singular values (S and F.S), the difference again is very small. But cannot say the same for the orthogonal matrices U, V .

Could you please clarify what should I do ?
Thank you in advance

I think the basic problem here is that this W is an ill-conditioned function of A, so it will be extremely sensitive to small differences in roundoff errors (in any language).

It correspond to taking A^T = V \Sigma U^T and then forming a new matrix W = A^T U \Sigma^{-1} U^T = VU^T. This is a problem because whenever A has small singular values the matrix U \Sigma^{-1} U^T that you are (implicitly) multiplying by is ill-conditioned. (In your example, the 300\times 300 matrix A has numerical rank \approx 133 — its near-zero singular values are basically numerical noise that you are amplifying to unity).

A symptom of this is that, even in Python, you get a completely different W matrix if you do the computation in double vs. single precision:

import numpy as np 

X = np.load('src_emb.npy')
Z = np.load('trg_emb.npy')

u,s,vt = np.linalg.svd(
w =

u64,s64,vt64 = np.linalg.svd(np.float64(Z)
w64 =

np.linalg.norm(w - w64, 2)

gives 1.99999... (i.e. roughly 2, which is what you would expect for the operator norm of the difference between two random unitary matrices).

You may need to re-think the computation you are performing if it depends on this matrix. (In some cases the final result may be okay if the algorithm is sufficiently clever/careful, even if intermediate matrices like W vary wildly.)


You may need to re-think the computation you are performing if it depends on this matrix. (In some cases the final result may be okay if the algorithm is sufficiently clever/careful, even if intermediate matrices like W vary wildly.)

svd occurs inside in a loop and since W is ill-conditioned (didn’t know that) 2 programs start to differ immediately. The one in python optimizes the result much better where as the Juila code stucks and finishes early.

Thank you very much for your time and clean explanation.

Cheers from Istanbul

PS. You can also pass alg=LinearAlgebra.QRIteration() to Julia’s svd function to use a slower (but slightly more accurate) algorithm, in case that helps. I would still be worried about relying on W=VU^* if you want a robust algorithm, however.


I have tried QRIteration() but nothing changed.

Thank you

Having exhausted other options, you can try to reverse engineer Numpy to see if there are preprocessing the matrix somehow:

It looks like they use a subset of LAPACK processed through f2c in the lapack_lite package.