How to restrict a matrix to be positive definite

amrods · February 26, 2025, 12:50am

I am estimating a model. In the process of estimation I have to convert a fixed independent and normally distributed vector to a correlated multivariate normal vector. The way I am doing that is using this:

\bf{x} \sim N(\bf{0}, I) \Rightarrow S \bf{x} \sim N(0,SS').

That means that if I can obtain a matrix \bf{S} such that \bf{S}\bf{S}' is positive semi definite, I can premultiply vector \bf{x} by \bf{S} to obtain a correlated normal vector. The problem I’m facing is that I am parameterizing matrix \Sigma = \bf{S} \bf{S}'. Say it is 3x3:

Σ = [s11 s12 s13;
     s12 s22 s23;
     s13 s23 s33]

Those are the parameters I am supplying the optimizer. I am obtaining \bf{S} using a Cholesky decomposition.

How can I impose restrictions on the components of Σ such that it is positive semi definite (and therefore the Cholesky decomposition exists)?

I thought about parameterizing \bf{S} instead, and then constructing Σ = S * transpose(S). The problem in that case is that Σ ends up having all positive values and I want to allow for some negative components in it (allowing for negative covariance between components of \bf{x}).

Here is a simple example:

using LinearAlgebra

x = randn(3, 100)

function obj(s; x)
    s11 = s[1]
    s12 = s[2]
    s13 = s[3]
    s22 = s[4]
    s23 = s[5]
    s33 = s[6]

    Σ = [s11 s12 s13;
         s12 s22 s23;
         s13 s23 s33]
    S = cholesky(Σ).L
    y = S * x 
    return f(y)
end

f(y) = sum(y)

Again, I want to be able to supply function obj to an optimizer, which means that it should be able to accept any vector s. What are restrictions I can place on Σ to ensure it is positive semi definite? For example, the main diagonal elements must be non-negative, which can be achieved with an exp transform.

Joris_Pinkse · February 26, 2025, 12:55am

Use the function isposdef after using LinearAlgebra?

amrods · February 26, 2025, 12:57am

That only checks for psd, it does not restrict Σ to be one.

Joris_Pinkse · February 26, 2025, 1:03am

What do you mean by restricting it to be one then?

amrods · February 26, 2025, 1:33am

What I want is to make function obj able to accept any inputs, transform them, and make a psd matrix from them. That way, I can safely supply function obj to an optimizer without having to worry whether the function will error because it can’t construct psd Σ.

Joris_Pinkse · February 26, 2025, 1:37am

In that case, something like starting with the Σ matrix that you have. Then,

Σ = Symmetric( Σ+Σ’) / 2

to make it symmetric. Then compute its minimum eigenvalue λ.

Then add (c-λ) * I to Σ

where c is the desired smallest eigenvalue (some positive number)

amrods · February 26, 2025, 1:47am

Thank you. Does that method “span” all positive psd matrices? That is, can I construct every possible psd matrix by varying c and Σ?

Joris_Pinkse · February 26, 2025, 2:01am

Yes

amrods · February 26, 2025, 3:29am

Since c is a parameter of my choice, do I also include it with the rest, so that the optimizer can control it?

Oscar_Smith · February 26, 2025, 4:20am

I thought about parameterizing instead, and then constructing Σ = S * transpose(S). The problem in that case is that Σ ends up having all positive values and I want to allow for some negative components in it (allowing for negative covariance between components of ).

I think this is probably the better approach. That said, I’m not sure why you would be seeing Σ ending up having all positive values. Were you making S be symmetric rather than triangular?

julia> S = [1 -1
            0  1];

julia> S*S'
2×2 Matrix{Int64}:
  2  -1
 -1   1

amrods · February 26, 2025, 5:51am

Ah yes, I was making that mistake!

Alexander_Knudson · February 26, 2025, 3:07pm

I’m not exactly sure how to solve your problem, but there’s the NearestCorrelationMatrix.jl library that can find a positive definite correlation matrix from an input matrix. Can you compute the the covariance matrix from the correlation matrix?

bertschi · February 26, 2025, 4:52pm

For optimization over constrained spaces, I usually reach for TransformVariables.jl providing transformations onto unconstrained vectors. In particular, for covariance matrices check out corr_cholesky_factor.
In case you want to get really fancy, you can also use Manopt.jl and optimize over the manifold of symmetric positive definite matrices directly.

amrods · March 2, 2025, 6:03am

I was checking TransformVariables.jl, the author says that the densities need to be adjusted. Why is that necessary? What I had in mind to do is to employ the delta method to obtain the correct variances of the transformed parameters (those in \Sigma) from the obtained estimates (those in S).

bertschi · March 5, 2025, 5:24pm

When you change your parametrization, e.g., mapping from an unconstrained vector x \in \mathbb{R}^n to some constrained object \theta \in \Theta, the density changes accordingly. Let g: x \to \theta, then

p(x) = p(\theta) \left| \frac{d\theta}{dx} \right|

where \left| \frac{d\theta}{dx} \right| denotes the log Jacobian of g.
E.g., in Bayesian estimation this is important when the prior is defined on \theta, but the sampling is done on x. When optimizing the likelihood this is usually not required as the maximum is not effected by the coordinate transform, e.g., \mathrm{argmax}_{\theta} p(D | \theta) = g\left( \mathrm{argmax}_{x} p(D | g(x))\right).
Don’t know how/if that applies to the delta method … hope it helps anyways.

amrods · March 5, 2025, 10:35pm

Thank you! Is there any name for that? Can you provide some references where I can read more about it?

What I had in mind to do was maximize the likelihood with respect to unconstrained vector, obtain the variance covariance of those parameters, and then use the delta method to obtain the variance covariance for the transformed parameters:

\hat{x} = \arg \max_x \ell(p(x)).

The variance covariance of \hat{x} is

var(\hat{x}) = -\left(\frac{\partial^2{\ell}}{\partial x^2}(\hat{x})\right)^{-1}.

Ultimately, I am interested in the variance covariance of the transformed parameters \hat{\theta} = p(\hat{x}). For that, I apply the delta method:

\sqrt{n} (p(\hat{x}) - p(x)) \sim N(0, h'var(\hat{x})h),

where h = \frac{\partial}{\partial x}p(x)'.

bertschi · March 6, 2025, 6:00pm

In the context of Bayesian modeling, the Stan manual discusses reparametrization and change of variables.
If I get your case correctly, you maximize the likelihood on the unconstrained space, approximate the uncertainty there via the Laplace approximation, i.e., using the Hessian as inverse variance, and then use another approximation – the delta method – in order to obtain the variance on the constrained space? Here, I would probably skip the delta method altogether and simply use the Laplace approximation on the constraint space directly, i.e., compute \frac{\partial^2 \mathcal{l}}{\partial \theta}(\hat{\theta}) directly. In any case, you won’t need to correct for the change of variable as the transformation is applied to the point estimates \hat{x} \to \hat{\theta} only.

Philippe_Maincon1 · March 6, 2025, 6:38pm

Any symmetric positive definite (SPD) matrix has a unique Cholesky decomposition
M=L*L'
where L is lower triangular. Hence the space of n by n SPD matrices has dimension n*(n+1)/2, and a way to constrain an algorithm to only explore all such matrices and only them is to use the upper triangular coefficients of L as unknowns.

Topic		Replies	Views
Random draws of multivariate normal with positive semi-definite covariance matrix Statistics linearalgebra	4	1095	March 26, 2022
Verify a matrix is positive semi-definite General Usage	13	9108	April 13, 2020
Optimizing over real matrices with eigenvalues in $(0,1)$ Optimization (Mathematical) linearalgebra	3	533	November 5, 2020
What is the best way to factorize/decompose a covariance matrix? Numerics question , linearalgebra	21	4330	March 9, 2020
Generate Covariance Matrix with Pre-Specified Restrictions Statistics question	6	862	July 23, 2020

How to restrict a matrix to be positive definite

Related topics