Troubleshooting LAPACKException

question

#1

I am getting a weird error that occurs unpredictably:

LoadError: Base.LinAlg.LAPACKException(1)
while loading D:\Jeremy\Dropbox\Documents\School\Waterloo\Research\Programming\Julia\Schrodinger\examples\punchout.jl, in expression starting on line 188
 in chklapackerror(::Int64) at lapack.jl:32
 in syevr!(::Char, ::Char, ::Char, ::Array{Complex{Float64},2}, ::Float64, ::Float64, ::Int64, ::Int64, ::Float64) at lapack.jl:5003
 in #eigfact!#22(::Bool, ::Bool, ::Function, ::Array{Complex{Float64},2}) at eigen.jl:54
 in (::Base.LinAlg.#kw##eigfact!)(::Array{Any,1}, ::Base.LinAlg.#eigfact!, ::Array{Complex{Float64},2}) at <missing>:0
 in #eigfact#23(::Bool, ::Bool, ::Function, ::Array{Complex{Float64},2}) at eigen.jl:80
 in (::Base.LinAlg.#kw##eigfact)(::Array{Any,1}, ::Base.LinAlg.#eigfact, ::Array{Complex{Float64},2}) at <missing>:0
 in #eig#24 at eigen.jl:85 [inlined]
 in eig at eigen.jl:85 [inlined]
 in expim(::Array{Complex{Float64},2}) at utils.jl:18
 in LindbladProp(::Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2}, ::Array{Any,1}, ::Array{Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},1}, ::Float64, ::Int64) at constructors.jl:153
 in LindbladProp(::Array{Any,1}, ::Array{Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},1}, ::Float64, ::Int64) at constructors.jl:160
 in prop_gen(::Float64, ::Float64, ::Int64, ::Int64, ::Int64, ::Float64, ::Int64) at punchout.jl:33
 in (::##15#16{Schrodinger.Density{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Complex{Float64},Int64},2}})(::Tuple{Float64,Float64}) at punchout.jl:59
 in collect_to!(::Array{Schrodinger.Result{Schrodinger.Density{Array{Complex{Float64},2},2},Symbol},2}, ::Base.Generator{Base.Prod2{Array{Float64,1},LinSpace{Float64}},##15#16{Schrodinger.Density{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Complex{Float64},Int64},2}}}, ::Int64, ::Tuple{Int64,Int64,Nullable{Float64},Bool}) at array.jl:340
 in collect_to_with_first!(::Array{Schrodinger.Result{Schrodinger.Density{Array{Complex{Float64},2},2},Symbol},2}, ::Schrodinger.Result{Schrodinger.Density{Array{Complex{Float64},2},2},Symbol}, ::Base.Generator{Base.Prod2{Array{Float64,1},LinSpace{Float64}},##15#16{Schrodinger.Density{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Complex{Float64},Int64},2}}}, ::Tuple{Int64,Int64,Nullable{Float64},Bool}) at array.jl:327
 in collect(::Base.Generator{Base.Prod2{Array{Float64,1},LinSpace{Float64}},##15#16{Schrodinger.Density{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Float64,Int64},2},Schrodinger.Operator{SparseMatrixCSC{Complex{Float64},Int64},2}}}) at array.jl:308
 in run_steady(::Int64, ::Int64) at punchout.jl:57
 in include_string(::String, ::String) at loading.jl:441
 in include_string(::Module, ::String, ::String) at eval.jl:34
 in (::Atom.##59#62{String,String})() at eval.jl:73
 in withpath(::Atom.##59#62{String,String}, ::String) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:38
 in macro expansion at eval.jl:71 [inlined]
 in (::Atom.##58#61{Dict{String,Any}})() at task.jl:60

It happens while running a long computation that is repeatedly doing the same thing (with different parameters). Is there anything in the error message that could help figure out the cause?


#2

The LAPACK routine zheevr (complex form of syevr) is returning with info=1 which is labelled an “internal error”. The most likely cause seems to be convergence failure for some eigenvalues, which the documentation says is “generally caused by unexpectedly inaccurate arithmetic”. (Symmetric eigensolvers are supposed to be very robust.) In short, your science problem became something of interest to numerical analysts and/or writers of linear algebra libraries.

I suggest that you wrap your call to eig() in a try block. You could flag the parameter set as unsolvable for now. Consider saving the matrix to disk and posting it — see if any of us can nail the problem down more precisely, or even solve it.


#3

Thanks! I managed to track down one of those matrices. Here is a dropbox link to a JLD file: https://www.dropbox.com/s/vsusswa66uc78l9/problematic_mat.jld?dl=0

Interestingly, if I make this matrix sparse and use eigs instead of eig it works…


#4

These matrices (others in a similar series) have a lot of identical eigenvalues, due to being tensored with the identity. That’s probably the cause of the instability. I’ll diagonalize them before I tensor them, which is really what I should have done from the beginning anyway.


#5

Yes, roundoff interaction with eigenvalue multiplicity may well be the cause of trouble. Oddly, eig was quite happy with your example matrix on my system - perhaps BLAS does things in a different order on my processor.

If you still have trouble after reorganizing the calculation, it might help to relax the tolerance used by the LAPACK routines. (Currently the tolerance parameter is hard-wired to -1.0 in the Julia wrappers, which is replaced by a very small value in LAPACK.)