Issue: PDMats.jl matrices losing positive-definiteness with Optim.jl autodiff

stevengj · October 25, 2025, 11:44am

You could also just use a change of variables to do nonlinear optimization over semidefinite matrices (which you have to include in “SPD” for optimization since the set of strictly positive-definite matrices is not closed). That is, instead of optimizing f(A) over “SPD” matrices A, you optimize f(X^T X) over unconstrained m \times m matrices X, or better yet f(Hermitian(X'X)) to ensure that it is exactly Hermitian despite roundoff errors.

This has no loss of generality, allows you to use any nonlinear optimization algorithm, and is easy. (The downside is that you might lose some special structure in f that would otherwise let you use something specialized like an SDP solver, but I’m guessing you don’t have a straightforward SDP or you would be using such a solver already.)

You can also, of course, use specialized methods for semidefinite constraints. And, of course, it depends a lot on what you want to do with the SPD matrix in your function. If you want to solve linear systems, for example, it would be more efficient to optimize over the vector space of lower- or upper-triangular matrices X, since then you have the Cholesky factorization for free.

Note, however, that X^T X may be slightly indefinite due to roundoff errors, so your f(X^T X) computation needs to be robust to such things. This is a general issue to be aware of when working with SDP matrices; how to ensure robustness depends on what precise calculation you’re performing, but a general idea is to treat slightly negative eigenvalues as zeros. Alternatively you can add some regularization like replacing f(A) with f(A + \varepsilon \Vert A \Vert_F I / m) for a small \varepsilon proportional to the precision. This has come up many times on this forum, e.g.:

(But this may not be an issue if your optimization problem does not favor matrices that are close to singular, i.e. if the SDP constraint is inactive at the optimum — in that case, as long as you start with a random X so that X^T X is far from singular, probably the gradient steps will keep things that way.)

Topic		Replies	Views
Maximum Likelihood Multivariate Model Optimization (Mathematical) optim	15	1646	January 10, 2020
Newton optimization with Optim doing strange things Optimization (Mathematical)	36	2919	April 12, 2018
Solve for Hermitian PSD with non-linear objective in JuMP Optimization (Mathematical) jump	19	1157	May 27, 2021
Type stable matrix factorization corner case with ForwardDiff Numerics question	9	1091	December 19, 2017
Find gradient of a posterior Statistics question	12	1041	May 16, 2019

Issue: PDMats.jl matrices losing positive-definiteness with Optim.jl autodiff

Related topics