Weird: GitHub CI computes opposite directions in the output of my PCA on some commits on Julia v1.7.3

I have an eigenvalue-decomposition-based PCA function in my library, and I test it with:

X = [1 8; 4.5 5.5; 9.5 0.5]
out = pcatest(X;K=2)
@test isapprox(out.X,[-4.58465   6.63182;-0.308999  7.09961; 6.75092   6.70262],atol=0.00001)

Now, I haven’t touched anything in that area of the library for years, but on some commits using the latest 1.x Julia version the CI now fails with:

 ** Testing pca()...
Test Failed at /home/runner/work/BetaML.jl/BetaML.jl/test/Utils_tests.jl:227
  Expression: isapprox(out.X, [-4.58465 6.63182; -0.308999 7.09961; 6.75092 6.70262], atol = 1.0e-5)
   Evaluated: isapprox([4.584647274835241 -6.631817953272375; 0.3089991577646427 -7.0996140402490004; -6.750915650181362 -6.702621717219043], [-4.58465 6.63182; -0.308999 7.09961; 6.75092 6.70262]; atol = 1.0e-5)

I don’t remember much about PCA, but I believe there can be different sign conventions when computing the eigenvectors, so the opposite signs don’t really matter… still, it seems odd that no single convention is followed consistently and the signs are (apparently) random… Is there a way I can enforce one, so I can test the results?
Has something changed in Julia 1.7.3 (the version to which GitHub expands Julia v1) compared to 1.7.2 that could explain this?

EDIT: I confirm this started about ~20 days ago, when GitHub CI began expanding v1 to 1.7.3…

I get the same output on both 1.7.2 and 1.7.3; this is how I ran it on both:

julia> using LinearAlgebra

julia> function pca(X;K=nothing,error=0.05)
           # debug
           #X = [1 10 100; 1.1 15 120; 0.95 23 90; 0.99 17 120; 1.05 8 90; 1.1 12 95]
           #K = nothing
           #error=0.05

           (N,D) = size(X)
           if !isnothing(K) && K > D
               error("The parameter K must be ≤ D") # abort: @error would only log and continue
           end
           Σ = (1/N) * X'*(I-(1/N)*ones(N)*ones(N)')*X
           E = eigen(Σ) # eigenvalues are ordered from the smallest to the largest
           totVar  = sum(E.values)
           explVarByDim = [sum(E.values[D-k+1:end])/totVar for k in 1:D]
           propVarExplained = 0.0
           if isnothing(K)
               for k in 1:D
                   if explVarByDim[k] >= (1 - error)
                       propVarExplained  = explVarByDim[k]
                       K                 = k
                       break
                   end
               end
           else
               propVarExplained = explVarByDim[K]
           end

           P = E.vectors[:,end:-1:D-K+1] # bug corrected 2/9/2021

           return (X=X*P,K=K,error=1-propVarExplained,P=P,explVarByDim=explVarByDim)
       end
pca (generic function with 1 method)

julia> X = [1 8; 4.5 5.5; 9.5 0.5]
3×2 Matrix{Float64}:
 1.0  8.0
 4.5  5.5
 9.5  0.5

julia> pca(X;K=2)
(X = [-4.584647274835242 6.6318179532723756; -0.30899915776464276 7.0996140402490004; 6.750915650181363 6.702621717219042], K = 2, error = 0.0, P = [0.7456907130975964 0.6662922484916046; -0.6662922484916048 0.7456907130975964], explVarByDim = [0.9980637093577439, 1.0])

Yes, I have installed Julia v1.7.3 locally and can’t replicate it (either running the function in isolation or testing the whole BetaML package)… I don’t know why GitHub sometimes gets that result; it’s really weird (and scary)…

This is probably just different versions of BLAS being used.
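If that is the case, you can check which BLAS build each environment actually loads. A minimal sketch, assuming Julia ≥ 1.7 (where `BLAS.get_config()` became available):

```julia
using LinearAlgebra

# List the BLAS/LAPACK libraries loaded in this session; a different
# library (or build) on CI vs. locally can produce eigenvectors with
# flipped signs, since both are valid decompositions.
for lib in BLAS.get_config().loaded_libs
    println(lib.libname)
end
```

Comparing this output between the GitHub runner and your local machine would confirm or rule out the BLAS hypothesis.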


Simply looking at eigen does not reveal any major work done on it between 1.7.2 and 1.7.3. It could be something under the hood in BLAS.

Eigenvectors are indeterminate up to multiplication by an arbitrary nonzero scalar: if Av=\lambda v, then A (cv) = \lambda (cv). It is probably best to design your tests to factor out this indeterminacy. You could normalize each eigenvector to unit length and fix the sign by requiring the first nonzero component to be positive.
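A minimal sketch of the sign-fixing idea (the `fixsigns` helper name is mine, not part of BetaML; here I pin the sign of the largest-magnitude component rather than the first nonzero one, which is more robust to near-zero entries):

```julia
using LinearAlgebra

# Flip each eigenvector (column) so its largest-magnitude component is
# positive. Since -v is also an eigenvector for the same eigenvalue,
# this removes the sign ambiguity without changing the decomposition.
function fixsigns(P::AbstractMatrix)
    Q = copy(P)
    for j in axes(Q, 2)
        i = argmax(abs.(view(Q, :, j)))
        if Q[i, j] < 0
            Q[:, j] .= -Q[:, j]
        end
    end
    return Q
end

A = [2.0 1.0; 1.0 2.0]
E = eigen(A)
P = fixsigns(E.vectors)   # columns now have a deterministic sign
```

Applying the same normalization to `P` inside `pca` (before computing `X*P`) would make the test output identical regardless of which BLAS produced the eigenvectors.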
