Nice. The timings look about like I would have expected. The Schur factorization is definitely much more involved.
You might get lucky on specific problems. But once you have done the pivoted Cholesky, the rest of your proposed algorithm is basically a block elimination step on a semidefinite matrix using \boldsymbol{E}_{ZZ} as a block pivot without any further permutations. Even with easy rank decisions in the initial Cholesky factorization, that’s just not stable in general. Fix-Heiberger might be more robust since it at least tries to deal with ill-conditioning of \boldsymbol{E}_{ZZ}, but then you have more rank decisions and those might be harder than the rank decisions in the original pivoted Cholesky. I could see that going wrong as well.
It might depend on your goals. If you are just writing code for your own use on a specific problem or set of problems, maybe you can get away with this. You can check the residuals to spot any problems and maybe even use iterative refinement to help with stability problems in cases in which you have a large residual. But this doesn’t seem like a good thing to put in code written for others that they might expect to be really robust. You are definitely venturing away from best practices for implementing robust numerical code.
(These plots are begging for a log–log scale, since you are expecting power laws here.)
I meant doing Pivoted Cholesky on the SPSD matrix and then the Hessenberg approach as above.
Basically what @stevengj proposed, just with Pivoted Cholesky, since the matrix is SPSD.
But doesn’t that require you to get it into the form A + \lambda I? It’s that reduction involving \boldsymbol{Q} = \boldsymbol{E}_{SS} - \boldsymbol{E}_{SZ} \boldsymbol{P}, etc. that I don’t entirely trust.
Although I have been thinking about it some more and I’m less confident that it would be a problem. Is \lambda positive by any chance?
I am encountering a similar problem. I was thinking that perhaps using the Generalised SVD could be better; have you thought of that? Linear algebra is not my strength, but for the case E = I there is a nice solution through the SVD where you can just plug in the \lambda. That way we can reuse the expensive factorization for many values of \lambda.
(If A is Hermitian, as it is here and in other regression-type problems, then an SVD is equivalent to eigen-diagonalization, which was already discussed above. If A is not Hermitian, then I’m not sure how an SVD helps you — it seems more fruitful to use Hessenberg or Schur factorizations for shifted solves of non-Hermitian square systems.)
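For the E = I case, here is a minimal Julia sketch of that reuse, using the shifted-solve support of `hessenberg` factorization objects (the data and shift values are random stand-ins):

```julia
using LinearAlgebra

# Reuse one factorization for many shifts λ in (A + λI) x = b.
# Stand-in data; A need not be Hermitian for this approach.
n = 1000
A = randn(n, n)
b = randn(n)

F = hessenberg(A)        # O(n^3) once: A = Q H Q' with H upper Hessenberg

for λ in (1e-3, 1e-2, 1e-1)
    x = (F + λ * I) \ b  # each shifted solve is only O(n^2)
    @show λ, norm((A + λ * I) * x - b)
end
```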
No, I mean just applying Pivoted Cholesky to \boldsymbol{C} (see the first post), which is SPSD. This should give a regular Cholesky form of two triangular matrices. Then just follow the Hessenberg approach. I just don’t know what the implications of that are.
Perhaps I’d need to see some equations to understand. From that verbal description, it is not at all clear to me what you have in mind here. And I’m not sure what you mean by the first post.
I will try to recap what we have.
The original problem is solving the following linear system:
\left( \boldsymbol{A} + \lambda \boldsymbol{C} \right) \boldsymbol{x} = \boldsymbol{A}^{T} \boldsymbol{b}
Under the assumptions:
- \boldsymbol{A}, \boldsymbol{C} \in \mathcal{S}_{+}^{n} (Symmetric Positive Semidefinite).
- \lambda > 0.
Option A
Solve the problem directly. Use the Bunch-Kaufman factorization to take advantage of the symmetry.
If the null spaces of \boldsymbol{A} and \boldsymbol{C} are guaranteed not to intersect, then \boldsymbol{A} + \lambda \boldsymbol{C} is SPD and you may use Cholesky.
This is what I knew and started with.
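For concreteness, a minimal Julia sketch of Option A (random stand-in data):

```julia
using LinearAlgebra

# Option A sketch: factor A + λC directly, exploiting symmetry.
# Stand-in data; A and C are SPSD by construction.
n = 500
G = randn(n, n); A = G * G'
H = randn(n, n); C = H * H'
b = randn(n); λ = 1e-2

K = Symmetric(A + λ * C)
x = bunchkaufman(K) \ (A' * b)
@show norm(K * x - A' * b)

# If A + λC is known to be SPD (null spaces of A and C do not
# intersect), plain Cholesky works too:
# x = cholesky(K) \ (A' * b)
```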
Option B
Use the QZ decomposition / generalized Schur decomposition as described in @mstewart’s post.
This solves the problem as it is.
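A minimal Julia sketch of the QZ route for shifted solves (random stand-in data; the details of @mstewart’s variant may differ):

```julia
using LinearAlgebra

# Option B sketch: one QZ (generalized Schur) factorization of the
# pencil (A, C), then O(n^2) work per shift λ. Stand-in data.
n = 300
G = randn(n, n); A = G * G'
H = randn(n, n); C = H * H'
b = randn(n)

# Complex QZ so S and T are genuinely upper triangular:
# A = Q S Z', C = Q T Z', hence A + λC = Q (S + λT) Z'.
F = schur(complex.(A), complex.(C))
g = F.Q' * (A' * b)

for λ in (1e-3, 1e-2, 1e-1)
    y = UpperTriangular(F.S + λ * F.T) \ g   # triangular solve per shift
    x = real(F.Z * y)
    @show λ, norm((A + λ * C) * x - A' * b)
end
```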
Option C
If we can assume at least one of \boldsymbol{A} or \boldsymbol{C} is SPD, decompose that matrix by Cholesky.
Then the second decomposition can be Hessenberg or an eigendecomposition. See @stevengj’s post for a sketch (and the code below).
I knew this approach with the eigen step, but it does not solve the problem as stated, since it assumes one of the matrices is SPD.
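A minimal Julia sketch of Option C, assuming \boldsymbol{C} is SPD (random stand-in data; `hessenberg` yields a tridiagonal factor here because the transformed matrix is symmetric):

```julia
using LinearAlgebra

# Option C sketch, assuming C is SPD. Stand-in data.
n = 400
G = randn(n, n); A = G * G'
H = randn(n, n); C = H * H' + I       # comfortably SPD
b = randn(n); f = A' * b

U = cholesky(Symmetric(C)).U          # C = U'U
B = Symmetric(U' \ (A / U))           # B = U^{-T} A U^{-1}, still symmetric
F = hessenberg(B)                     # O(n^3) once; tridiagonal since B is symmetric

for λ in (1e-3, 1e-2, 1e-1)
    # (A + λC) x = f  ⇔  U' (B + λI) U x = f
    x = U \ ((F + λ * I) \ (U' \ f))
    @show λ, norm((A + λ * C) * x - f)
end
```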
Option D
Use an eigen / LDL decomposition to reduce to a problem of the form symmetric + non-negative diagonal.
Solve as described in my post.
It might raise the numerical issues pointed out by @mstewart.
Option E
I am asking what would happen if we use Option C even though \boldsymbol{C} is SPSD (with at least one zero eigenvalue).
Instead of plain Cholesky, we’ll use Pivoted Cholesky on \boldsymbol{C}. The decomposition itself will go through.
The question is: what are the implications of having, due to numeric inaccuracies, small negative eigenvalues in \boldsymbol{C}? Let’s say -1\mathrm{e}{-12}.
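For concreteness, here is what Julia’s pivoted Cholesky (`RowMaximum()`, Julia ≥ 1.8; `Val(true)` on older versions) reports on a genuinely rank-deficient SPSD matrix (random stand-in data):

```julia
using LinearAlgebra

# Pivoted Cholesky of a rank-deficient SPSD matrix.
# Stand-in data with exact rank r < n.
n, r = 200, 150
W = randn(n, r)
C = W * W'

F = cholesky(Symmetric(C), RowMaximum(); check = false)
@show rank(F)                     # numerical rank, set by the pivot tolerance

# C[F.p, F.p] ≈ Ut'Ut for the truncated factor; the trailing n - r
# pivots sit at roundoff level and can come out slightly negative
# (the -1e-12 case), which the rank decision has to absorb.
Ut = F.U[1:rank(F), :]
@show norm(C[F.p, F.p] - Ut' * Ut)

# Undoing the permutation gives a factor of C in the original ordering:
Uhat = Ut[:, invperm(F.p)]
@show norm(C - Uhat' * Uhat)
```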
The thing that is confusing me is that, unless I missed something, option C assumes matrix C is positive definite, not semidefinite. The issues with extending option C go beyond just getting a Cholesky decomposition (pivoted or otherwise) of matrix C. You still are left with something that looks like
\left( \begin{bmatrix} \boldsymbol{E}_{SS} & \boldsymbol{E}_{SZ} \\ \boldsymbol{E}_{SZ}^{T} & \boldsymbol{E}_{ZZ} \end{bmatrix} + \lambda \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \right) \begin{bmatrix} \boldsymbol{y}_{S} \\ \boldsymbol{y}_{Z} \end{bmatrix} = \begin{bmatrix} \boldsymbol{f}_{S} \\ \boldsymbol{f}_{Z} \end{bmatrix}
So I’m not seeing how Hessenberg decomposition applies to this directly. You had some ideas on that with your matrix Q, etc. Mathematically, I think those were OK, but the numerics seem less clear, although maybe not hopeless. Now you seem to be indicating that that is not what you are thinking. But I don’t see a more direct way. So I’m at a loss as to what you are proposing to get a problem to which a Hessenberg reduction will be relevant unless it is what you outlined before.
This is exactly my question.
What guarantees do I have from the Pivoted Cholesky?
Can I follow Option C with \hat{\boldsymbol{U}} = \boldsymbol{U} \boldsymbol{P}^{T}?
Otherwise, I might fall back on the “Engineering Path”: just add a small \gamma \boldsymbol{I} (say \gamma = 5 \mathrm{e}{-10}) to \boldsymbol{C} and be done with it.
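A sketch of that engineering path, reusing the Option C machinery (random stand-in data; note the accuracy warning in the reply below):

```julia
using LinearAlgebra

# "Engineering path" sketch: shift the singular SPSD C by γI so it is
# numerically SPD, then proceed as in Option C. Stand-in data.
n, r = 200, 150
G = randn(n, n); A = G * G'
W = randn(n, r); C = W * W'            # genuinely singular SPSD
b = randn(n); λ = 1e-2; γ = 5e-10

U = cholesky(Symmetric(C + γ * I)).U   # C + γI is SPD
B = Symmetric(U' \ (A / U))            # ‖B‖ can blow up like ‖A‖/γ
F = hessenberg(B)
x = U \ ((F + λ * I) \ (U' \ (A' * b)))
@show norm((A + λ * C) * x - A' * b)   # always check the residual here
```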
You can certainly apply the permutation to the Cholesky factor. I don’t see how that helps with the problem that you don’t end up with something that looks like E + \lambda I. Instead of I you get I plus a block of zeros in the lower right corner. The pivoted Cholesky is the easy part. The fact that it doesn’t give you a matrix of the form \lambda I is the hard part.
You would be computing a Cholesky factor of a perturbed singular matrix and then using it as a transformation of A. I would expect this to result in more or less completely inaccurate results. (i.e. no correct digits.)