LDLt factorization for full matrices

Bunch–Kaufman does not return a strictly diagonal matrix, but a tridiagonal one:

BunchKaufman{Float64, Matrix{Float64}}
D factor:
4×4 Tridiagonal{Float64, Vector{Float64}}:
 0.425595  0.0        ⋅         ⋅ 
 0.0       0.213406  0.484524   ⋅ 
  ⋅        0.484524  0.162901  0.0
  ⋅         ⋅        0.0       0.891978

Isn’t there any factorization of the form L D L^T = P^T A P?

Oh, right, though once you have the tridiagonal matrix you can run ldlt on it.
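For example, something like this (a minimal sketch; note that ldlt on a SymTridiagonal does no pivoting, so this assumes the tridiagonal D has no zero pivots):

```julia
using LinearAlgebra

# Factor A with Bunch–Kaufman, then refactor its tridiagonal D with ldlt.
A = Symmetric([4.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 2.0])
F = bunchkaufman(A)
D = F.D                                    # stored as Tridiagonal
S = SymTridiagonal(diag(D), diag(D, -1))   # same matrix, symmetric storage
G = ldlt(S)                                # S = G.L * G.D * G.L'
```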

Why do you need this factorization?

Thank you @stevengj.

I have a matrix H=V \Lambda V^T (eigenvalue decomposition) and I need to solve systems of the form \overline{H} x = y, with \overline{H} = V |\Lambda| V^T. At the beginning I was using the spectral decomposition, but I have problems with some ill-conditioned matrices. I was wondering if using the factorization P^T\overline{H} P=L|D|L^T (in theory, this is the same as P^T V |\Lambda| V^T P) would be more robust and efficient.

Anyway, for completeness, I think there should be an L D L^T factorization for full matrices (unless this factorization is useless), as happens with the LU factorization.

If you don’t care about stability*, just run lu without pivoting and extract the diagonal D from the diagonal of U. If it’s symmetric (and no pivoting is used), then U = DL'.

*This is probably why such a method doesn’t exist in LinearAlgebra.
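A minimal sketch of that trick (assuming the unpivoted factorization succeeds; NoPivot needs Julia ≥ 1.7, older versions spell it lu(A, Val(false))):

```julia
using LinearAlgebra

# Unpivoted LU of a symmetric matrix: U equals D * L'.
A = [4.0 2.0; 2.0 3.0]
F = lu(A, NoPivot())
D = Diagonal(diag(F.U))
L = F.L
# A ≈ L * D * L'  and  F.U ≈ D * L'
```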

It might be nice to have an abs(A::AbstractMatrix) that computes |A| accurately and efficiently, but I’m not sure if there is a standard definition for non-Hermitian matrices. Is it |A| = \sqrt{A^*A} for any square A? In what kind of application is this function used?

But it sounds like you can use a combination of bunchkaufman and ldlt in your case.

The polar decomposition (which yields |A|) is available in the MatrixFactorizations package.

Thank you for your help and sorry for my late reply (I was out this weekend).

Numerical stability is important in this case, so I need methods that consider matrix permutations. I think the best workaround for the moment is bunchkaufman + ldlt, or directly ldlt(sparse(B)).

Is there anything which matches MATLAB’s ldl()?
How does MATLAB get around the numerical issues and get L and D for any matrix?


https://fossies.org/linux/SuiteSparse/LDL/Doc/ldl_userguide.pdf looks interesting


I think the solution might be some syntactic sugar around the pivoted Cholesky decomposition, so one gets out a structure similar to the non-pivoted Cholesky.

If your matrix B is symmetric, is the SVD not what you want?

No, because the matrix U must be lower triangular, not orthogonal.


Note that this doesn’t hold for zero eigenvalues (in that case it is not necessarily true that U = V). Though you can always drop the zero singular values, and then it will hold for the nonzero subset.

I’m late to this, but there are some things I’m not sure I understand about what you are trying to do. Given a symmetric matrix H, the goal sounds like it is to get a matrix absolute value \overline{H} (i.e. \overline{H} = (H^T H)^{1/2}, also known as the symmetric polar factor of H) so that you can solve a system of equations involving that matrix. The polar decomposition of a real, square matrix H (not necessarily symmetric) is a decomposition of the form

H = U \overline{H}

where U is orthogonal and \overline{H} is symmetric positive definite. It can be adapted to complex and non-square matrices, but I’m focusing on what looks like your case.

If this is your goal, the eigenvalue decomposition solves everything nicely; once you have computed it, you can use the eigenvalue decomposition directly to solve the system without ever forming \overline{H} explicitly. It will be backward stable, so you’re not going to do much better in terms of stability. If conditioning is an issue, you could even consider doing some sort of regularization with the spectral decomposition. If I understand your problem correctly, I think the spectral decomposition is the best thing you could use on every measure except possibly speed.
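In code, solving \overline{H}x = y straight from the spectral decomposition might look like this (a sketch with a made-up 2×2 example):

```julia
using LinearAlgebra

# Solve (V * |Λ| * V') x = y without ever forming the modified matrix.
H = Symmetric([2.0 1.0; 1.0 -1.0])   # indefinite example
y = [1.0, 2.0]
Λ, V = eigen(H)
x = V * ((V' * y) ./ abs.(Λ))
```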

If you’re just trying to speed up computation of a polar factor, there are more specialized methods. (There’s a chapter on the polar decomposition in Functions of Matrices: Theory and Computation, SIAM, by N. Higham.) Specifically, there are iterations for polar factors that might be faster, but you do have to start worrying about convergence and stability to a greater extent. If you want to give something a try,

U_{k+1} = \frac{1}{2} U_k(3I - U_k^T U_k), \qquad U_0 = H

is an iteration that should converge quadratically to U in the polar decomposition if H has eigenvalues satisfying |\lambda| < \sqrt{3}. You can scale H to make the eigenvalues satisfy the bound if needed. Once you have U you should be able to get \overline{H} = U^T H, although there could perhaps be loss of orthogonality in U and symmetry in H due to numerical errors. I don’t have any practical experience with this in terms of stability or how long it takes before quadratic convergence really kicks in, but the theory is in Higham’s book. I think it’s likely that the spectral decomposition is going to be less work and worry if you can afford it.
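A sketch of that iteration (the scaling by the operator norm is just one way to get the eigenvalues inside the |\lambda| < \sqrt{3} region; the fixed iteration count and the function name are arbitrary choices here, and a real implementation would test for convergence):

```julia
using LinearAlgebra

# Iterate U ← U*(3I - U'U)/2 to approximate the orthogonal polar factor,
# then recover the symmetric polar factor |H| = U' * H.
function polar_abs(H; iters = 40)
    U = Matrix(H) / opnorm(Matrix(H))   # scale so all |λ| ≤ 1 < √3
    for _ in 1:iters
        U = U * (3I - U'U) / 2
    end
    return U' * Matrix(H)               # ≈ |H|
end

H = Symmetric([2.0 1.0; 1.0 -1.0])      # indefinite, nonsingular example
Habs = polar_abs(H)
```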

The part that makes me wonder if I’m misinterpreting the problem is that I’m not sure where the LDL^T factorization would enter into it. If you compute P^T H P = LDL^T, then it is not in general true that the polar factor satisfies \overline{H} = P L |D| L^T P^T, so LDL^T is pretty useless for computing the polar factor. I suppose you could compute \overline{H} and then its LDL^T decomposition (actually, since it’s symmetric positive definite, it has a Cholesky decomposition). But that’s not going to be any more stable than the spectral decomposition you used to get \overline{H} in the first place. If you go with an iteration to get \overline{H}, using Cholesky would be natural at that point.

But maybe I just wrote a lot without understanding what you want…


Thanks @mstewart for your detailed answer, and sorry for the late reply.

I need to modify the Hessian in a Newton optimization algorithm. The modified Hessian must be positive definite to guarantee that the search direction is a descent direction.

The first option was \overline{H}=V|\Lambda|V^T, but it wasn’t performing well. So I thought about \widehat{H}=P L |D| L^T P^T, where D is strictly diagonal. (I think I have seen somewhere that \overline{H} and \widehat{H} coincide, but I am not sure.) And then I saw that Julia has LDL^T support for sparse matrices, but not for full ones. That’s why I opened this post.

Nevertheless, computing \widehat{H} is not the best option for modifying the Hessian in Newton optimization algorithms. After reading the post pointed to by @zdenek_hurak, Shouldn’t bunchkaufman be actually named ldlt?, and Section 3.4 in [Nocedal and Wright, Numerical Optimization, 2006], I realize that the LDL^T factorization does not seem stable, and that Bunch–Kaufman H=P L B L^T P^T is the appropriate one. Since B is block diagonal with block size \leq 2, it is easy to compute its eigenvalue decomposition and modify it so that the Hessian is positive definite.
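What I have in mind is roughly this (a sketch; the threshold δ, the function name, and the use of F.p to undo the permutation are my own choices):

```julia
using LinearAlgebra

# Rough sketch of a modified symmetric indefinite factorization:
# factor H, push the eigenvalues of the block-diagonal B up to δ, reassemble.
function modified_hessian(H; δ = 1e-8)
    F = bunchkaufman(Symmetric(H))
    λ, W = eigen(Symmetric(Matrix(F.D)))   # cheap: blocks are at most 2×2
    B̂ = W * Diagonal(max.(λ, δ)) * W'      # push eigenvalues up to δ
    T = F.uplo == 'U' ? F.U : F.L          # stored factor depends on uplo
    M = T * B̂ * T'                         # modified H, still permuted
    Ĥ = similar(M)
    Ĥ[F.p, F.p] = M                        # undo the symmetric permutation
    return Symmetric(Ĥ)
end
```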

So now my doubts are:

-Is it true that the H=P L D L^T P^T factorization is unstable for full matrices? If not, why use the Bunch–Kaufman factorization instead? I also think there should be a function to compute the LDL^T factorization for full matrices if the latter is stable.

-In the case of sparse matrices, what does ldlt compute? Is it the strict LDL^T factorization or Bunch–Kaufman? (The help just says it uses CHOLMOD.) If the first, isn’t it unstable? If the second, I’d find it confusing to name the same factorization differently depending on whether the matrix is full or sparse.

-If I execute F=ldlt(A), with A some sparse symmetric matrix, is there a way to display D and/or L? Because F.D, Matrix(F.D), sparse(F.D), etc., do not work.

I definitely didn’t have a clear idea of what you were trying to do. Knowing that you are modifying a Hessian helps. If I’ve got it right now, it sounds like you’ve tried modifying the Hessian by flipping the signs of negative eigenvalues, which left you with ill-conditioned matrices, and you are now aiming for the procedure described in Section 3.4 under the heading “Modified Symmetric Indefinite Factorization”? If ill conditioning was the problem before, that seems somewhat expected from the strategy you were using to modify the Hessian. You should be able to increase the problematic eigenvalues to something somewhat small (but not too small), as in equation (3.41) in Nocedal and Wright (2nd edition), and get as good convergence as you will with anything else. In that equation, you can pick \delta to avoid ill conditioning.

The modification based on LDL^T (in some form) should be faster on a per-iteration basis, and you might not do much (or any) worse on convergence. But there is some risk with LDL^T of computing a modification of the Hessian that is larger than you need, which could slow convergence a bit. Pivoting potentially matters both for stability and for the size of the Hessian modification you get out at the end.

To answer your question about stability: I’m a little thrown off by the wording. I’m guessing that when you write H=PLDL^TP^T you mean for D to be diagonal. In this context, it’s somewhat common for D to represent a block diagonal matrix with 2\times 2 blocks. I would normally read it that way, but I suspect that’s not what you mean. Here is the situation:

If A is symmetric positive definite, you have A = LDL^T where D is an ordinary diagonal (i.e. 1\times 1 “blocks”). This can be computed in a stable manner without pivoting. You can also just do Cholesky in this case.
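To illustrate the relationship (a small sketch): if A = CC^T is the Cholesky factorization, then L = C\,\mathrm{diag}(C)^{-1} and D = \mathrm{diag}(C)^2 give A = LDL^T.

```julia
using LinearAlgebra

# Relating Cholesky to LDL' for a symmetric positive definite matrix.
A = [4.0 2.0; 2.0 3.0]
C = cholesky(A).L
s = Diagonal(diag(C))
L = C / s        # unit lower triangular
D = s^2
# A ≈ L * D * L'
```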

If you are looking for exactly diagonal D for an indefinite matrix, it’s a bit worse than unstable. The decomposition you are looking for does not even exist in general for symmetric indefinite A, even with symmetric pivoting. The standard example is

A = \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right]

It has nothing to do with sparsity, since this matrix could of course be stored in a sparse format. It’s also tridiagonal, so being tridiagonal does not help. To see why this doesn’t work, first note that there are only two possible permutations P=I and P=A, both of which leave P^T A P = A. So you can ignore permutations. If you try to write

A = \left[ \begin{array}{cc} 1 & 0 \\ l_1 & 1 \end{array} \right] \left[ \begin{array}{cc} d_1 & 0 \\ 0 & d_2 \end{array} \right] \left[ \begin{array}{cc} 1 & l_1 \\ 0 & 1 \end{array} \right]

and multiply that out, you quickly find that d_1=d_2=0, which clearly doesn’t work. Hence the decomposition simply doesn’t exist. Another way to look at it: if you were to use scalar (1\times 1) pivots, your only choices with a symmetric P^T A P pivoting approach are the zeros on the diagonal of A. You have to either give up symmetric pivoting (and perhaps compute P^T A = LU) or accept some 2\times 2 blocks on the diagonal.

Even when you don’t have exactly zero pivots on the diagonal, small pivots cause numerical problems, just as with LU factorization. So you need a pivoting strategy that is symmetric and avoids small scalar pivots, not just zero ones. Bunch–Kaufman is one such strategy: it uses 2\times 2 pivots when they are a better choice numerically than a scalar pivot. (Bunch–Parlett pivoting is a somewhat stronger but somewhat slower alternative; the relation between the two is analogous to that between partial and complete pivoting in standard LU factorization.) If you want a method that preserves symmetry, applies to an arbitrary indefinite symmetric matrix, and is stable, you are going to need pivoting and will have to put up with 2\times 2 blocks on the diagonal.

In terms of what’s in Julia, I haven’t used these routines in Julia before. But I tried the following:

using LinearAlgebra
using SparseArrays

A=[0.0 1.0; 1.0 0.0]
#F1=ldlt(sparse([0.0 1.0; 1.0 0.0]))
#F2=ldlt(SymTridiagonal([0.0, 0.0], [1.0])) 
F3=bunchkaufman([0.0 1.0; 1.0 0.0])

I have commented out the ldlt calls because they both throw zero-pivot exceptions. The sparse version of ldlt calls SuiteSparse, which is a little hard to dig into. (Presumably it is pivoting to minimize fill-in and not worrying about zero or small pivots.) The call for the SymTridiagonal matrix ultimately calls

function ldlt!(S::SymTridiagonal{T,V}) where {T,V}
    n = size(S,1)
    d = S.dv
    e = S.ev
    @inbounds for i in 1:n-1
        iszero(d[i]) && throw(ZeroPivotException(i))
        e[i] /= d[i]
        d[i+1] -= e[i]^2*d[i]
    end
    return LDLt{T,SymTridiagonal{T,V}}(S)
end

which does no pivoting. So I think it’s safe to say that neither of the methods behind ldlt takes into account stability, or even the existence of the decomposition in the case of zero pivots. I think bunchkaufman is really your only choice here if you want something generally applicable, reliable, and readily available in Julia. Unless you want to revisit the eigenvalue approach.

I don’t know about displaying the factors if you do use ldlt on a sparse problem. It looks like an issue of how to convert a SuiteSparse representation to something you can access more directly in Julia. I don’t know how to do that and am curious about that as well. But in your case, you really shouldn’t be using routines that don’t pivot for stability, so perhaps the question is moot.


Yes, by D I meant a block diagonal matrix with 1 \times 1 blocks, i.e. strictly diagonal.

The example you have given to show that a PLDL^TP^T factorization may not exist is very helpful.

From the REPL help, it seems to me that ldlt does no pivoting on SymTridiagonal matrices, but it does on SparseMatrixCSC. However, this factorization is still not useful because, as you say, it may not exist — and indeed yields an error in your example. By the way, here is some help to retrieve the factors from the ldlt decomposition.

So, now it is clearer to me that bunchkaufman is the best option. However, it works only for full matrices. Is there any Bunch–Kaufman factorization for sparse matrices? In Matlab there seems to be a stable LBL^T (B block diagonal) factorization for both dense and sparse matrices. I have not found anything similar in Julia.

I’m not on a computer where I can try Matlab, although maybe I can check it out later. Looking it over, it’s not clear to me where that doc page says it will produce 2\times 2 blocks when the function is applied to a sparse indefinite matrix. Does Matlab do that when you try it? I thought Matlab used CHOLMOD, whose documentation pretty clearly says that A needs to have well-conditioned leading principal submatrices. That is a pretty strong hint that it isn’t pivoting for stability.

Update: I did try it in Matlab. It looks like they are using the algorithm MA57, which is in HSL. There appears to be an interface to HSL at HSL.jl. If it works as expected, it looks like you can get back the separate factors to modify and enforce positive definiteness.


Thank you very much @mstewart and others, you have helped me a lot.