I definitely didn’t have a clear idea of what you were trying to do; knowing that you are modifying a Hessian helps. If I’ve got it right now, you’ve tried modifying the Hessian by flipping the signs of negative eigenvalues, which left you with ill-conditioned matrices, and you are now aiming for the procedure described in section 3.4 under the heading “Modified Symmetric Indefinite Factorization”? If ill conditioning was the problem before, that seems somewhat expected from the strategy you were using to modify the Hessian. You should be able to increase the problematic eigenvalues to something small (but not too small), as in equation (3.41) in Nocedal and Wright (2nd edition), and get convergence as good as you will get with anything else. In that equation, you can pick \delta to avoid ill conditioning. The modification based on LDL^T (in some form) should be faster per iteration, and you might not do much (or any) worse on convergence. But with LDL^T there is some risk of computing a modification of the Hessian that is larger than you need, which could slow convergence a bit. Pivoting potentially matters both for stability and for the size of the Hessian modification you get out at the end.
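For concreteness, here is a minimal sketch in Julia of that eigenvalue-based modification as I read (3.41); the function name and the particular choice of \delta are mine, not from the book:

using LinearAlgebra

# Lift any eigenvalue below delta up to delta, so the modified
# Hessian is positive definite with smallest eigenvalue at least delta.
function modified_hessian(H::Symmetric, δ)
    λ, Q = eigen(H)              # spectral decomposition H = Q*Diagonal(λ)*Q'
    return Symmetric(Q * Diagonal(max.(λ, δ)) * Q')
end

H = Symmetric([2.0 1.0; 1.0 -1.0])    # indefinite test matrix
B = modified_hessian(H, 1e-8 * norm(H))
eigvals(B)                            # all eigenvalues now ≥ δ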
To answer your question about stability: I’m a little thrown off by the wording. I’m guessing that when you write H=PLDL^TP^T you mean for D to be diagonal. In this context, it’s common for D to denote a block diagonal matrix with a mix of 1\times 1 and 2\times 2 blocks. I would normally read it that way, but I suspect that’s not what you mean. Here is the situation:
If A is symmetric positive definite, you have A = LDL^T where D is an ordinary diagonal (i.e. 1\times 1 “blocks”). This can be computed in a stable manner without pivoting. You can also just do Cholesky in this case.
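For instance (a toy example of my own):

using LinearAlgebra
A = [4.0 1.0; 1.0 3.0]        # symmetric positive definite
C = cholesky(Symmetric(A))    # succeeds with no pivoting; A ≈ C.L * C.L'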
If you are looking for exactly diagonal D for an indefinite matrix, it’s a bit worse than unstable. The decomposition you are looking for does not even exist in general for symmetric indefinite A, even with symmetric pivoting. The standard example is
A = \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right]
It has nothing to do with sparsity, since this matrix could of course be stored in a sparse format. It’s also tridiagonal, so being tridiagonal does not help. To see why this doesn’t work, first note that there are only two possible permutations P=I and P=A, both of which leave P^T A P = A. So you can ignore permutations. If you try to write
A = \left[ \begin{array}{cc} 1 & 0 \\ l_1 & 1 \end{array} \right] \left[ \begin{array}{cc} d_1 & 0 \\ 0 & d_2 \end{array} \right] \left[ \begin{array}{cc} 1 & l_1 \\ 0 & 1 \end{array} \right]
and multiply that out, the (1,1) entry forces d_1 = 0 and the (2,2) entry then forces d_2 = 0, while the off-diagonal entry requires l_1 d_1 = 1, which is impossible. Hence the decomposition simply doesn’t exist. Another way to look at it: with scalar (1\times 1) pivots, your only choices under a symmetric P^T A P pivoting approach are the zeros on the diagonal of A. You either have to give up symmetric pivoting and perhaps compute P^T A = LU (there’s a quick illustration below), or accept that you will have some 2\times 2 blocks on the diagonal.

Even when you don’t have exactly zero pivots on the diagonal, small pivots cause numerical problems, as with LU factorization. So you need a pivoting strategy that is symmetric and avoids small scalar pivots, not just zero pivots. Bunch-Kaufman is one such strategy: it uses 2\times 2 pivots when they are numerically a better choice than a scalar pivot. (Bunch-Parlett is a somewhat stronger but slower alternative; the relationship between the two is analogous to that between partial and complete pivoting in standard LU factorization.) If you want a method that preserves symmetry, applies to an arbitrary symmetric indefinite matrix, and is stable, you are going to need pivoting and will have to put up with 2\times 2 blocks on the diagonal.
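As an aside, here is what the nonsymmetric escape hatch looks like on this example (my own illustration): ordinary LU with partial pivoting factors the matrix without trouble, but gives up the symmetry:

using LinearAlgebra
A = [0.0 1.0; 1.0 0.0]
F = lu(A)     # partial pivoting: F.P * A ≈ F.L * F.U
F.p           # row permutation [2, 1]; the symmetry of A is not preserved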
In terms of what’s available in Julia: I haven’t used these routines before, but I tried the following:
using LinearAlgebra
using SparseArrays
A = [0.0 1.0; 1.0 0.0]
#F1 = ldlt(sparse(A))                           # throws a zero pivot exception
#F2 = ldlt(SymTridiagonal([0.0, 0.0], [1.0]))   # throws a zero pivot exception
F3 = bunchkaufman(A)                            # succeeds, using a 2×2 pivot
I have commented out the ldlt calls because they both throw zero pivot exceptions. The sparse version of ldlt calls SuiteSparse, which is a little hard to dig into. (Presumably it is pivoting to minimize fill-in and not worrying about zero or small pivots.) The call for the SymTridiagonal version ultimately calls
function ldlt!(S::SymTridiagonal{T,V}) where {T,V}
    n = size(S,1)
    d = S.dv
    e = S.ev
    @inbounds for i in 1:n-1
        iszero(d[i]) && throw(ZeroPivotException(i))
        e[i] /= d[i]
        d[i+1] -= e[i]^2*d[i]
    end
    return LDLt{T,SymTridiagonal{T,V}}(S)
end
which does no pivoting. So I think it’s safe to say that neither of the ldlt methods takes into account stability, or even the existence of the decomposition in the case of zero pivots. I think bunchkaufman is really your only choice here if you want something generally applicable, reliable, and readily available in Julia, unless you want to revisit the eigenvalue approach.
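If you do go that route, the factor-then-solve mechanics look something like this (a sketch with made-up numbers; note it does not include the modification of D from section 3.4, just the plumbing):

using LinearAlgebra
H = [2.0 1.0; 1.0 -1.0]          # a symmetric, indefinite stand-in Hessian
g = [1.0, -0.5]                  # stand-in gradient
F = bunchkaufman(Symmetric(H))
p = -(F \ g)                     # Newton-type step via the factorization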
I don’t know about displaying the factors if you do use ldlt on a sparse problem. It looks like an issue of how to convert a SuiteSparse representation to something you can access more directly in Julia. I don’t know how to do that and am curious about it as well. But in your case, you really shouldn’t be using routines that don’t pivot for stability, so perhaps the question is moot.
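For the dense bunchkaufman factorization, at least, I believe the factors are accessible as properties of the factorization object (this is my reading of the LinearAlgebra docs; worth double-checking):

F3.D    # block diagonal D; for this A, a single 2×2 block
F3.U    # unit upper triangular factor (the default uplo is :U)
F3.p    # the permutation, as a vector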