Cross platform differences in `\` (and `det`)


#1

I gave a tutorial to my students on a mac and showed that Julia gives a reasonable error message for a singular system:

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> b = [1;2;3]
3-element Array{Int64,1}:
 1
 2
 3

julia> A\b
ERROR: Base.LinAlg.SingularException(3)
Stacktrace:
 [1] A_ldiv_B! at ./linalg/lu.jl:238 [inlined]
 [2] \(::Base.LinAlg.LU{Float64,Array{Float64,2}}, ::Array{Int64,1}) at ./linalg/factorization.jl:48
 [3] \(::Array{Int64,2}, ::Array{Int64,1}) at ./linalg/generic.jl:817

But on my student’s Windows computer, he gets an answer!

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b* (2017-06-19 13:05 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> b = [1; 2; 3]
3-element Array{Int64,1}:
 1
 2
 3

julia> A\b
3-element Array{Float64,1}:
 -0.333333
  0.666667
  0.0

These are both Julia’s current release.

Why the difference?


#2

I can’t reproduce the singular exception on my mac:

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-2860QM CPU @ 2.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, sandybridge)

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> b = [1;2;3]
3-element Array{Int64,1}:
 1
 2
 3

julia> A \ b
3-element Array{Float64,1}:
 -0.333333
  0.666667
  0.0

And indeed I have

julia> A * (A \ b) ≈ b
true

#3

But isn’t that the problem, since

julia> [1 2 3; 4 5 6; 7 8 9] * [-1,2,-1]
3-element Array{Int64,1}:
 0
 0
 0

and the singularity should be detected somewhere along the way, eg by A_ldiv_B! the latest?


#4

I can reproduce on my mac:

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> b = [1;2;3]
3-element Array{Int64,1}:
 1
 2
 3

julia> A \ b
ERROR: Base.LinAlg.SingularException(3)
Stacktrace:
 [1] A_ldiv_B! at ./linalg/lu.jl:238 [inlined]
 [2] \(::Base.LinAlg.LU{Float64,Array{Float64,2}}, ::Array{Int64,1}) at ./linalg/factorization.jl:48
 [3] \(::Array{Int64,2}, ::Array{Int64,1}) at ./linalg/generic.jl:817

I assumed the problem with providing an answer is that won’t be an unique solution:

julia> x1 = [-1/3,2/3,0]
3-element Array{Float64,1}:
 -0.333333
  0.666667
  0.0

julia> x2 = [-1,2,-2/3]
3-element Array{Float64,1}:
 -1.0
  2.0
 -0.666667

julia> A*x1 ≈ A*x2 ≈ b
true

ETA: seems related to this discussion: https://github.com/JuliaLang/julia/issues/17591

@stevengj’s comment:
“qrfact(A) is more than 2x slower than lufact(A) on my machine for a 1000x1000 matrix. Can’t \ on a square matrix just switch to QR only if lufact throws a SingularException?”

Perhaps this flow is what’s going on in the Windows version? While mac simply throws the exception?

Depending on your desired output, you could call the corresponding method:

julia> qrfact(A,Val{true}) \ b
3-element Array{Float64,1}:
 -0.0555556
  0.111111
  0.277778

julia> qrfact(A) \b
3-element Array{Float64,1}:
 -0.333333
  0.666667
 -0.0

julia> lufact(A) \b
ERROR: Base.LinAlg.SingularException(3)
Stacktrace:
 [1] A_ldiv_B! at ./linalg/lu.jl:238 [inlined]
 [2] \(::Base.LinAlg.LU{Float64,Array{Float64,2}}, ::Array{Int64,1}) at ./linalg/factorization.jl:48

#5

On Windows 10, 64-bit I get:

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> A\[1; 2; 3]
ERROR: Base.LinAlg.SingularException(3)
Stacktrace:
 [1] A_ldiv_B! at .\linalg\lu.jl:270 [inlined]
 [2] \ at .\linalg\factorization.jl:71 [inlined]
 [3] \(::Array{Int64,2}, ::Array{Int64,1}) at .\linalg\generic.jl:849

julia>

Sorry: forgot to say “with 0.7”. @MikaelSlevinsky, @rdeits, @JackDevine, @Tamas_Papp: could it be version diff?


#6

Interesting, on Ubuntu, with openblas I get the same singular error, but when I use MKL I get

julia> A\b
3-element Array{Float64,1}:
 -0.233333
  0.466667
  0.1     

Which is a solution, but as pointed out above, not a unique one. Then if I use MATLAB on a windows machine the I get

>> A\b
 -0.333333
  0.666667
  0.0

But it also gives a warning saying that the matrix is singular and that the result might not be accurate.


#7

Fascinating! So why do some build and platform combinations use qrfact while others use lufact? It appears to be due to rounding errors in different blas libraries manifesting themselves in lufact.


#8

Is it possible that slight differences in roundoff errors between platforms cause it to sometimes not detect that the matrix is singular? (i.e. maybe on some platforms when it does the LU factorization it doesn’t exactly get zero pivots?)


#9

That seems quite possible, I get a singular exception with openblas but not with mkl and if we compare the LU factorisations, we get

# With openblas
julia> lufact(A)
Base.LinAlg.LU{Float64,Array{Float64,2}} with factors L and U:
[1.0 0.0 0.0; 0.142857 1.0 0.0; 0.571429 0.5 1.0]
[7.0 8.0 9.0; 0.0 0.857143 1.71429; 0.0 0.0 0.0]

# With mkl
julia> lufact(A)
Base.LinAlg.LU{Float64,Array{Float64,2}} with factors L and U:
[1.0 0.0 0.0; 0.142857 1.0 0.0; 0.571429 0.5 1.0]
[7.0 8.0 9.0; 0.0 0.857143 1.71429; 0.0 0.0 -1.58603e-16]

You can see that the last elements are different due to roundoff error.


#10

Yes, there may be platform differences, even though all platforms are supposed to observe the IEEE standard for doubles and floats. By the way, the matrix as defined is integer… And so is the right-hand side.


#11

What about the linear-algebra libraries: rounding off versus truncation.


#12

Good point, I guess I was being quite lazy when I said “roundoff error”.


#13

As the manual says, integer matrices are promoted to floats for this sort of thing. Thus they are passed off to LAPACK and BLAS routines which do scalings and operator orderings which differ (in this case) between different generations of Intel processors using the same libraries. (So we get a singularity flag on Haswell, but not on Sandybridge because LAPACK only checks for an exact zero.) (Just to be specific about what stevengj suggested above.)

This is all consistent with IEEE math, but illustrates the dangers of working naively with singular or badly-conditioned matrices.