# Lanczos tridiagonalization algorithm in Julia?

Hi everyone!

I have been looking for an implementation of the Lanczos algorithm in Julia, but to my surprise I haven’t found anything obvious!

I need the actual tridiagonalization, i.e. the tridiagonal matrix T and the unitary/orthogonal matrix V (in Wikipedia’s notation).
It seems that this tridiagonalization algorithm is used internally in ARPACK routines, but I cannot find a function that is exposed to the user… is it really such a niche operation?

I hope this question is OK here… I haven’t posted or used Julia seriously in quite a long time…

Do you have a dense or sparse matrix? How big?

If you have a huge sparse matrix (or some other huge matrix represented implicitly somehow), then it doesn’t make sense to compute the whole unitary matrix V, because this is a dense matrix that you wouldn’t be able to store, not to mention the cost is \Theta(n^3) in the matrix size n. Krylov iterative methods like Lanczos are designed in practice to compute only a part of the tridiagonalization, typically to span only a small number of extremal eigenvalues.

If you have a dense/moderate-size Hermitian matrix, on the other hand, and you want the unitary reduction to real-symmetric tridiagonal form, then what you want is called the “Hessenberg factorization” and there are better algorithms than Lanczos, typically using Householder reflectors. In Julia, you can call hessenberg(Hermitian(A)) in the LinearAlgebra standard library, which uses the underlying LAPACK routines for this factorization.
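To make that call concrete, here is a minimal sketch (variable names are my own). For a Hermitian input, `F.H` comes back as a real `SymTridiagonal` and `F.Q` is the unitary factor:

```julia
using LinearAlgebra

# Minimal sketch: unitary reduction of a Hermitian matrix to
# real-symmetric tridiagonal form via hessenberg().
n = 5
A = Hermitian(randn(n, n) + im * randn(n, n))
F = hessenberg(A)

T = F.H          # real SymTridiagonal for Hermitian input
Q = Matrix(F.Q)  # unitary factor, materialized as a dense matrix

Q * T * Q' ≈ A   # the factorization A = Q T Q' holds
```

Materializing `Q` as a dense matrix is exactly the O(n^2)-storage step that becomes infeasible for huge sparse matrices.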

(What is your application, by the way?)


Thanks a lot, this is extremely helpful!

The matrix will be very sparse and roughly 10^4 x 10^4 in size (but I’m still in the early stages of this project and it might change). However, I think I will need to store the full unitary matrix V to go back to the original basis at the end of the subsequent calculations, so the Hessenberg factorization is probably what I need! Thank you for pointing me in the right direction :).
I’m actually surprised that I haven’t found Hessenberg factorization by googling “tridiagonalization Julia”.

The reason why I am looking to tridiagonalize a matrix is related to a technique to simulate open quantum systems. Roughly speaking, the idea is to pass from a “star” configuration (all the bath oscillators interacting with the central “main” system) to a “chain” configuration (the system interacts with the first oscillator, which in turn interacts with the next one, and so on), which makes simulating the time evolution of the system easier. In this context, I found the Lanczos algorithm mentioned in this paper.

Note that if you don’t store the full matrix V, then in principle you could use Lanczos, but in practice it won’t work — if you try to get the whole tridiagonal matrix with the Lanczos algorithm (without storing the V basis and doing explicit re-orthogonalization), then roundoff errors spoil the algorithm through a well-known phenomenon called “ghost eigenvalues”.
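To make this concrete, here is a hypothetical, minimal Lanczos sketch (my own throw-away code, not a package routine) running the plain three-term recurrence with no re-orthogonalization. In exact arithmetic the columns of Q stay orthonormal; in floating point, watching ‖Q′Q − I‖ grow with the iteration count is precisely the roundoff problem described above:

```julia
using LinearAlgebra

# Plain Lanczos three-term recurrence, no re-orthogonalization.
# Returns the m×m tridiagonal T and the Krylov basis Q.
function lanczos(A, q1, m)
    n = length(q1)
    Q = zeros(eltype(q1), n, m)
    α = zeros(real(eltype(q1)), m)
    β = zeros(real(eltype(q1)), m - 1)
    Q[:, 1] = q1 / norm(q1)
    for k in 1:m
        w = A * Q[:, k]
        α[k] = real(dot(Q[:, k], w))
        w -= α[k] * Q[:, k]
        k > 1 && (w -= β[k-1] * Q[:, k-1])
        k == m && break
        β[k] = norm(w)
        Q[:, k+1] = w / β[k]
    end
    return SymTridiagonal(α, β), Q
end

A = Symmetric(randn(200, 200))
T, Q = lanczos(A, randn(200), 100)
# In exact arithmetic Q'Q = I; in floating point this norm grows
# as extremal eigenvalues converge — the loss of orthogonality above.
println(norm(Q' * Q - I))
```

Note that the local recurrence A·Q ≈ Q·T still holds to roundoff columnwise even when global orthogonality is lost, which is why the ghost-eigenvalue failure is easy to miss.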

So, basically, there is a good reason why the various Lanczos implementations in Julia (or elsewhere) don’t provide this.

In any case, in linear algebra we generally try to distinguish the matrix factorization (e.g. Hessenberg, LU, QR, …) from the algorithm used to compute it (which varies). Hessenberg is one of the least known factorizations — while it has its uses, it most commonly appears as a hidden intermediate step before diagonalization in eigenvalue algorithms.

Thanks also for the explanation of the technical details of the Lanczos algorithm’s problems!

It also makes perfect sense to separate the desired matrix factorization from the underlying algorithm and Hessenberg is indeed what I need for this problem, thanks again.

This is probably not the right place for this question/comment, but it confused me quite a bit.
I was a bit surprised that the Hessenberg factorization changes order depending on whether I explicitly classify a matrix as Hermitian/Symmetric or not.
For example, the code

using LinearAlgebra
n = 3
A = Hermitian(randn(n, n) + im * randn(n, n))
Fherm = hessenberg(A)
F = hessenberg(Array(A))


gives me the following matrix Fherm.Q

 -0.470896+0.727489im   0.102718-0.488329im   0.0+0.0im
 -0.393331+0.307094im   -0.44611+0.742947im   0.0+0.0im
        0.0+0.0im             0.0+0.0im       1.0+0.0im


but the following F.Q

 1.0+0.0im        0.0+0.0im             0.0+0.0im
0.0+0.0im  -0.820389-0.112762im   0.521083-0.206684im
0.0+0.0im   -0.11539-0.548572im  -0.546988-0.621738im


Of course both factorizations are correct, but for the Hermitian version the element that stays the same between the original A and the corresponding tridiagonal matrix is the (n,n) entry instead of the (1,1) entry.
Is there a good reason for this behaviour? It’s a very minor thing, but I would prefer the behaviour to be consistent; it took me a while to figure out that this was the reason for a bug in my code…

In general, Julia will use different algorithms to compute things depending on the types of your data. It’s using a different algorithm for Hermitian matrices that is about 2x faster, but which gives a different (valid) factorization. (The actual speedup might be a little less than 2x, but I’m seeing a 50% speedup for n=100 and I would expect it to grow as n gets bigger.)

Thanks! Yes, I understand this, but to my naive eye it just seems that one algorithm starts the iteration from the opposite end, and I don’t see a good reason to have things this way.
But of course you are right: nowhere in the documentation is any additional property of the factorization guaranteed, so since they are both valid factorizations I’m just asking too much.

That sounds totally possible, and if it’s easily fixable without costing performance, it would be a good PR.

We are calling LAPACK routines for all of these Hessenberg factorizations (one function dgehrd/zgehrd for general real/complex matrices, a different function dsytrd/zhetrd for tridiagonal reduction of real/complex Hermitian matrices). So it’s not easy to change the algorithms.

I think this is a result of the fact that you are working with a Hermitian matrix defined by the upper triangular part of A, which is the default for Hermitian. This then forces the LAPACK routine zhetrd to work with the upper triangular part of A. In terms of the elements in the upper triangular part of A, the decomposition you want corresponds to computing Householder transformations to introduce zeros into the rows, starting with the first and working down to the last. However, since A is in column-major storage, the elements in those rows are not stored sequentially, so that is not efficient. In this case, LAPACK does the more efficient thing by starting with the last column and working back to the first.

If you pass in a Hermitian matrix stored in the lower triangular part of A, zhetrd does something different: since the elements in the columns are sequential, it can efficiently start at the first column and move right. This is mathematically equivalent to starting in the first row and working down. Julia does accommodate storage of Hermitian matrices in the lower triangular part. If you do

A = Hermitian(randn(n, n) + im * randn(n, n), :L)


I think you will get the decomposition you want. It seemed to work for me.
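A quick check of this (keeping in mind that the first column being e_1 is the observed, LAPACK-implied behaviour discussed here, not something the Julia docs promise):

```julia
using LinearAlgebra

n = 4
A = Hermitian(randn(n, n) + im * randn(n, n), :L)  # store the lower triangle
F = hessenberg(A)
Q = Matrix(F.Q)

# With :L storage, zhetrd builds Q = H(1)⋯H(n-1) from reflectors that
# leave e₁ untouched, so the first column of Q comes out as e₁:
Q[:, 1]           # ≈ [1, 0, 0, 0]
Q * F.H * Q' ≈ A  # and the factorization still holds
```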


Thanks a lot for unpacking this mystery for me! I confirm that using the :L option does the trick. To be honest, I wasn’t even aware that there were different ways to store a Hermitian matrix, but clearly knowing what’s under the hood and not using Julia as a black box pays off!

Since this isn’t documented either in Julia or LAPACK, I wouldn’t rely on it, e.g. it may change in a future LAPACK version.

It’s much better to accept that things like Hessenberg factorizations, eigenvectors, SVDs, etcetera are not unique — the vectors (e.g. columns of Q) can be scaled by any e^{i\phi} in the complex case, for example. If your code depends on the linear-algebra routine yielding a certain phase \phi, then it is a bug in your code.

(This is not just Julia. The same applies to using such a factorization in any language — you shouldn’t depend on undocumented implementation details that are independent of the mathematical definition of the factorization.)

I definitely agree. I wouldn’t want to sound like I was recommending depending on anything other than a backward stable factorization for anything other than throw-away code. That was definitely not my intent.

However, in the unlikely event someone really is interested in having some degree of uniqueness from a Householder Hessenberg reduction, it’s interesting to note that the LAPACK documentation does offer some helpful guarantees beyond just a stable factorization. In the “further details” part of the documentation for zhetrd, the sequence of Householder transformations for a matrix stored in the lower triangular part is explicitly given. This structure implies that the first column of Q is e_1. It’s also documented that the off diagonal elements are real. So if the tridiagonal is unreduced and the off-diagonals are positive, then the implicit Q theorem implies that the decomposition is unique. Unfortunately it’s not guaranteed in the documentation that the off-diagonals are non-negative. But this is easily achieved by fixing the phase with a diagonal similarity. So the phase issue you mention is really the only source of nonuniqueness in the unreduced case. At that point, with one very fast extra step, you have a unique decomposition in the unreduced case, with the properties that determine uniqueness being fully documented. Similar comments apply to zgehrd.
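That phase fix via a diagonal similarity is cheap to sketch. The function below (`fix_signs` is my own name, not a library routine) chooses a ±1 diagonal D so that the off-diagonals of D·T·D are nonnegative, and replaces Q by Q·D so the factorization is unchanged:

```julia
using LinearAlgebra

# Diagonal similarity making the off-diagonals of a real symmetric
# tridiagonal nonnegative; (Q*D)*(D*T*D)*(Q*D)' equals Q*T*Q'.
function fix_signs(T::SymTridiagonal, Q)
    n = size(T, 1)
    d = ones(n)
    for k in 1:n-1
        # choose d[k+1] so the transformed off-diagonal d[k]*ev[k]*d[k+1] ≥ 0
        d[k+1] = d[k] * T.ev[k] >= 0 ? 1.0 : -1.0
    end
    # with this choice, D*T*D has diagonal T.dv and off-diagonal |T.ev|
    return SymTridiagonal(T.dv, abs.(T.ev)), Q * Diagonal(d)
end

A = Hermitian(randn(5, 5) + im * randn(5, 5))
F = hessenberg(A)
T2, Q2 = fix_signs(F.H, Matrix(F.Q))
all(T2.ev .>= 0)    # true: off-diagonals are now nonnegative
Q2 * T2 * Q2' ≈ A   # true: the factorization is unchanged
```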

Of course, all bets are off for reduced tridiagonals/Hessenberg matrices after you hit that first zero on the subdiagonal. And, by way of further caveats, I would assume it still counts as undocumented what Julia passes to the LAPACK routines, or even if it uses LAPACK. And mathematical uniqueness might not mean much if the decomposition is ill-conditioned. And finally, I’m not sure of that many reasons to worry about fixing the first column of Q in a full Householder reduction to Hessenberg form other than if you want something like controller-Hessenberg form, in which case zgehrd is clearly not the right thing to call. So this is all kind of academic, I suppose.

Dear @stevengj and @mstewart , thanks a lot for the deep discussion and insight into this numerical linear algebra perspective, all very useful for a novice like me!

What do you mean by “controller-Hessenberg form”? Anyway, for me it’s not academic. In my particular physics application it’s actually essential that the first column/row of Q is e_1, and that’s probably why, in the paper I mentioned, they suggest applying the Lanczos algorithm instead of a Hessenberg factorization. What other functions would you call to achieve this in a more “deterministic” manner?
Thanks again for all the insight you are sharing with me!

Oops. I had sort of missed that you needed a particular first column of Q. It had looked more like you were just surprised by the inconsistency. I’m a numerical analyst, not a physicist, so I’m probably living dangerously by trying to pull anything out of the paper you linked to, but it seemed to be using Lanczos for computing quadrature rules. Normally you truncate that early, and work with a tridiagonal that is much smaller than the original matrix. (Truncation of this type is mentioned on page 7 of the paper.) In that case Lanczos is the standard tool. And Lanczos does address any concerns about the first column of Q; you get to choose that.

However I have the impression you need the full basis for reasons that go beyond computing a quadrature rule? In that case, going with a full tridiagonal reduction does give you the full basis, the full tridiagonal matrix, and with better stability as well. But the first column is tricky. In most applications where you want the full decomposition, you commonly don’t really care about the first column of Q. Having it be a multiple of e_1 does happen naturally in most implementations, but it’s not usually something anyone tries to guarantee. Hence the behavior you observed.

As @stevengj mentioned, in doing this with hessenberg you would be depending on undocumented behavior, which is a bad idea if you want to use your code over the long term or even distribute it. LAPACK does document the structure of the decomposition clearly enough that you can infer (even though they don’t state it explicitly) that the first column is e_1 if you use the general (nonhermitian) routine or the hermitian routine with the matrix stored in the lower triangular part of A. The catch is that those details don’t seem to be similarly documented on the Julia side. Other than the fact that it is mathematically a Hessenberg factorization, what is guaranteed seems to be just that hessenberg returns a Hessenberg object that contains a HessenbergQ object. The internal structure of the HessenbergQ object is not specified. So, while LAPACK is safe to use this way, Julia could potentially stop using LAPACK for this and return something different for HessenbergQ.

Controller-Hessenberg form is a joint decomposition of two matrices B and A where A is square and B has the same number of rows as A. In the most basic case, B would be a single column b. The decomposition takes the form

\left[ \begin{array}{cc} Q^* b & Q^* A Q \end{array} \right] = \left[ \begin{array}{cc} \alpha e_1 & H \end{array} \right]

where Q is unitary and H is Hessenberg. From Q^* b = \alpha e_1 we get Q e_1 = b/\alpha. So up to scaling, you can make the first column of Q whatever you want. If \|b\|_2 = 1, then |\alpha|=1 and a rescaling of the columns of Q can guarantee \alpha=1. This gets used in linear control system design. It’s basically a Householder-based Hessenberg reduction where you get to choose the direction of the first column of Q. If this were already available in Julia, it might be an easy guarantee for that first column. But maybe that would be overkill if LAPACK already guarantees that. And I don’t see any obvious libraries that provide this in Julia anyway.
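Since nothing in Julia seems to provide this, here is a hedged from-scratch sketch for the real case (my own unblocked code, so much slower than a blocked LAPACK routine): the k = 0 reflector is built from b, and the reflectors for k ≥ 1 are the usual Householder Hessenberg reduction, which never touches e_1 again:

```julia
using LinearAlgebra

# Sketch: given square real A and vector b, build orthogonal Q with
# Q'b = α e₁ and Q'AQ in Hessenberg form. The k = 0 reflector maps b to
# a multiple of e₁; later reflectors leave e₁ fixed, preserving that.
function controller_hessenberg(A::AbstractMatrix{<:Real}, b::AbstractVector{<:Real})
    n = size(A, 1)
    H = Matrix{Float64}(A)
    Q = Matrix{Float64}(I, n, n)
    for k in 0:n-2
        x = k == 0 ? Vector{Float64}(b) : H[k+1:n, k]
        nx = norm(x)
        nx == 0 && continue
        v = copy(x)
        v[1] += (x[1] >= 0 ? nx : -nx)  # Householder vector, stable sign choice
        v ./= norm(v)
        rows = k == 0 ? (1:n) : (k+1:n)
        H[rows, :] .-= 2 .* v * (v' * H[rows, :])  # reflector from the left
        H[:, rows] .-= 2 .* (H[:, rows] * v) * v'  # ... and from the right
        Q[:, rows] .-= 2 .* (Q[:, rows] * v) * v'  # accumulate Q = P₀P₁⋯
    end
    return H, Q
end

A = randn(6, 6); b = randn(6)
H, Q = controller_hessenberg(A, b)
# Q'*A*Q = H is Hessenberg and Q'*b is a multiple of e₁
```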

If you really need a full decomposition with a hard long-term guarantee that the first column of Q will be e_1, it’s kind of tricky. If it’s throw-away code, I personally would just do the easy thing and go with the undocumented behavior.

If you need something a little more stable, you might call LAPACK through the interface documented at LAPACK-functions. The docs only describe the non-hermitian functions: gehrd! for computing the Hessenberg reduction and orghr! for forming Q from the Householder vectors. But you could use those with a little bit of a slowdown in the initial decomposition; forming Q should be the same speed. The hermitian functions are available but not listed in the docs. And the docs do warn that the LAPACK API is not stable, so maybe that is of no real benefit: you’d be trading an undocumented property for something that is documented but comes with a warning that it might change.

You could try doing your own LAPACK interface, but that’s going to depend on low-level FFI stuff, so maybe that’s not stable either; I haven’t used the Julia FFI before, so I don’t know about that. Also, if you haven’t used LAPACK directly before, calling those functions can be error-prone and is not very enjoyable; you really need to know the underlying algorithms to make sense of what’s going on.

Finally, if you have done this sort of thing before, it’s not hard to throw together a rudimentary version of the algorithm from scratch in Julia. But you are going to take a very big hit in terms of speed, just from losing the blocking and the corresponding BLAS operations that LAPACK does, even if you do everything right in terms of writing otherwise efficient Julia code.

It does seem like this falls between the cracks in terms of what’s available in Julia in a stable, documented way. If you really need it, it might be a choice of a lesser of evils.


Wow, thanks a lot for your detailed reply, I really appreciate it. Your comment is very useful to put into focus what I am trying to do at the moment.

To be honest I am still in an early phase of exploration and learning, so I am trying to understand the properties of the tridiagonal (“chain” in the physics picture) Hamiltonian versus the original “star” configuration. I wanted to keep the full unitary transformation to have everything under control. So I think that for the moment it’s OK to just use undocumented properties of the Hessenberg factorization in Julia / LAPACK.

However, you are completely right: in the end the idea is exactly to truncate early and simulate the evolution of the smaller truncated system. So I guess in the end I will actually need Lanczos and not a full Hessenberg factorization.
Since you have been so kind, may I ask you to point me in the right direction to learn more about how the Lanczos algorithm is used to compute quadrature rules, and possibly to relevant Julia packages and resources? And don’t worry, even if you are not able to spend more time helping me, I’m already extremely grateful!

I’m very happy to help. That sounds like a reasonable plan. If it’s not too slow, if you’re happy with the starting vector b = e_1, and if you’re also happy that you currently get that starting vector from hessenberg (but maybe won’t later), there’s probably no pressing reason to change what you are doing at the moment.

I see that @stevengj gave a link to QuadGK.jl, which I don’t think I would have found. I’m comfortable with these algorithms, but I haven’t been a Julia user for all that long. In terms of references, the paper by Golub and Welsch referenced in the paper you are looking at is the classic paper on computing Gauss quadrature rules in this way. The book Matrices, Moments, and Quadrature by Golub and Meurant goes into all the connections with Lanczos and applications; it should be the most comprehensive thing you could look at. There are probably some other papers/tech reports from those authors on this topic that you could find on the web. Numerical Linear Algebra by Trefethen and Bau is a very nice and readable intro text to numerical linear algebra that has a section on this (referenced in the comments of the QuadGK.jl routine). This is a very nice numerical analysis topic to learn about; it ties together a bunch of things you wouldn’t immediately think would be connected.
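In case a concrete example helps while reading those references: the Golub–Welsch construction takes the tridiagonal Jacobi matrix of the orthogonal polynomials; its eigenvalues are the quadrature nodes and the squared first components of its eigenvectors (scaled by the total weight) are the quadrature weights. A small sketch for Gauss–Legendre, where the recurrence coefficients are known in closed form (`gauss_legendre` is my own name):

```julia
using LinearAlgebra

# Golub–Welsch sketch: Gauss–Legendre nodes/weights from the eigen-
# decomposition of the Jacobi matrix of the Legendre polynomials.
function gauss_legendre(m)
    β = [k / sqrt(4k^2 - 1) for k in 1:m-1]  # Legendre recurrence coefficients
    J = SymTridiagonal(zeros(m), β)          # Jacobi matrix (α_k = 0)
    λ, V = eigen(J)
    return λ, 2 .* V[1, :] .^ 2              # nodes, weights (∫₋₁¹ dx = 2)
end

x, w = gauss_legendre(5)
# 5 points integrate polynomials up to degree 9 exactly, e.g. ∫₋₁¹ x⁴ dx = 2/5:
sum(w .* x .^ 4)  # ≈ 0.4
```

The Lanczos connection: running Lanczos on a diagonal matrix of discretized points with a starting vector of square-root weights produces exactly this kind of Jacobi matrix.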


Thanks a lot again to both of you, I’ve learned a lot!

In the end, the clever way to perform the final open-quantum-system simulation is to compute the recursion coefficients \alpha_k and \beta_k of the orthogonal polynomials for an arbitrary weight function, instead of discretizing the weight function “by hand” and applying a tridiagonalization to the resulting matrix, as I was trying to do initially.
Luckily, I found this Julia package that already does it for me: https://timueh.github.io/PolyChaos.jl/ (interestingly, it uses the Stieltjes procedure by default and not Lanczos).