How to restrict a matrix to be positive definite

I am estimating a model. In the process of estimation I have to convert a fixed vector of independent, normally distributed draws into a correlated multivariate normal vector. I am doing that like this:

\bf{x} \sim N(\bf{0}, I) \Rightarrow \bf{S}\bf{x} \sim N(\bf{0}, \bf{S}\bf{S}').
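To make the transformation concrete, here is a small sanity check (the matrix S below is just an illustrative example) that the sample covariance of S x does approach SS':

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(1)

# Any matrix S works here; SS' is the covariance of S*x.
S = [1.0  0.0;
     -0.5 1.0]

# Columns of x are independent standard-normal draws.
x = randn(2, 1_000_000)
y = S * x

# The sample covariance of y should be close to S*S'.
C = cov(y, dims = 2)
@show C
@show S * S'
```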

That means that if I can obtain a matrix \bf{S} such that \Sigma = \bf{S}\bf{S}' is positive definite, I can premultiply the vector \bf{x} by \bf{S} to obtain a correlated normal vector. The problem I’m facing is that I am parameterizing the matrix \Sigma = \bf{S}\bf{S}' directly. Say it is 3×3:

Σ = [s11 s12 s13;
     s12 s22 s23;
     s13 s23 s33]

Those are the parameters I am supplying the optimizer. I am obtaining \bf{S} using a Cholesky decomposition.

How can I impose restrictions on the components of Σ such that it is positive definite (and therefore its Cholesky decomposition exists)?

I thought about parameterizing \bf{S} instead, and then constructing Σ = S * transpose(S). The problem in that case is that Σ ends up having all positive values and I want to allow for some negative components in it (allowing for negative covariance between components of \bf{x}).

Here is a simple example:

using LinearAlgebra

x = randn(3, 100)

function obj(s; x)
    s11, s12, s13, s22, s23, s33 = s

    Σ = [s11 s12 s13;
         s12 s22 s23;
         s13 s23 s33]
    # cholesky throws a PosDefException when Σ is not positive definite
    S = cholesky(Σ).L
    y = S * x
    return f(y)
end

f(y) = sum(y)

Again, I want to be able to supply the function obj to an optimizer, which means it should accept any vector s. What restrictions can I place on the components of Σ to ensure it is positive definite? For example, the main-diagonal elements must be positive, which can be achieved with an exp transform.
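One way to combine the exp idea with the S-parameterization is the so-called log-Cholesky trick: fill a lower-triangular S from the unconstrained vector s, exponentiate only the diagonal, and set Σ = SS'. The sketch below (the function name make_Σ and the 3×3 layout are illustrative) maps any s ∈ ℝ⁶ to a positive definite Σ that can still have negative off-diagonal entries:

```julia
using LinearAlgebra

# Map an unconstrained vector s ∈ ℝ⁶ to a positive definite 3×3 Σ.
# Diagonal entries of the Cholesky factor are made positive via exp;
# off-diagonal entries are left free, so Σ can have negative covariances.
function make_Σ(s)
    S = [exp(s[1])  0.0        0.0;
         s[2]       exp(s[4])  0.0;
         s[3]       s[5]       exp(s[6])]
    return S * S', S
end

Σ, S = make_Σ(randn(6))
@show isposdef(Σ)   # true for any input s: S is triangular with positive diagonal
```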


Use the function isposdef after using LinearAlgebra?

That only checks whether Σ is positive definite; it does not restrict Σ to be one.

What do you mean by restricting it to be one then?

What I want is for the function obj to accept any inputs, transform them, and build a positive definite matrix from them. That way, I can safely supply obj to an optimizer without worrying that the function will error because it cannot construct a positive definite Σ.

In that case, start with the Σ matrix that you have and first make it symmetric:

Σ = Symmetric((Σ + Σ') / 2)

Then compute its minimum eigenvalue λ and add (c - λ) * I to Σ,

where c is the desired smallest eigenvalue (some positive number).
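A minimal sketch of that recipe (the function name make_posdef and the default c are my own choices; I only apply the shift when the smallest eigenvalue is below c, which keeps matrices that are already sufficiently positive definite untouched):

```julia
using LinearAlgebra

# Force an arbitrary square matrix A to be positive definite with
# smallest eigenvalue at least c (c > 0 is chosen by the user).
function make_posdef(A, c = 1e-6)
    Σ = Symmetric((A + A') / 2)     # symmetrize first
    λ = eigmin(Σ)                   # smallest eigenvalue
    λ < c ? Σ + (c - λ) * I : Σ     # shift the spectrum up if needed
end

A = randn(3, 3)
@show isposdef(make_posdef(A))
```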


Thank you. Does that method “span” all positive definite matrices? That is, can I construct every possible positive definite matrix by varying c and Σ?

Yes


Since c is a parameter of my choice, do I also include it with the rest, so that the optimizer can control it?

I thought about parameterizing \bf{S} instead, and then constructing Σ = S * transpose(S). The problem in that case is that Σ ends up having all positive values and I want to allow for some negative components in it (allowing for negative covariance between components of \bf{x}).

I think this is probably the better approach. That said, I’m not sure why Σ would end up with all positive values. Were you making S symmetric rather than triangular?

julia> S = [1 -1
            0  1];

julia> S*S'
2×2 Matrix{Int64}:
  2  -1
 -1   1

Ah yes, I was making that mistake!

I’m not exactly sure how to solve your problem, but the NearestCorrelationMatrix.jl library can find a positive definite correlation matrix close to an input matrix. Can you compute the covariance matrix from the correlation matrix?


For optimization over constrained spaces, I usually reach for TransformVariables.jl, which provides transformations between constrained spaces and unconstrained vectors. In particular, for covariance matrices, check out corr_cholesky_factor.
In case you want to get really fancy, you can also use Manopt.jl and optimize over the manifold of symmetric positive definite matrices directly.


I was checking TransformVariables.jl; the author says that the densities need to be adjusted. Why is that necessary? What I had in mind was to employ the delta method to obtain the correct variances of the transformed parameters (those in \Sigma) from the estimates obtained (those in S).
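For what it's worth, the density adjustment comes from the change-of-variables formula: if \theta = g(s), the density of \theta picks up a Jacobian factor |\det \partial g / \partial s|. The delta method itself can be sketched numerically; everything below (the toy 2×2 map g, the estimates ŝ, and the covariance V) is a made-up illustration, with a finite-difference Jacobian standing in for an AD package:

```julia
using LinearAlgebra

# Delta method sketch: if ŝ has covariance V, then g(ŝ) has approximate
# covariance J V J', where J is the Jacobian of g evaluated at ŝ.
# Here g maps the 3 free entries of a 2×2 lower-triangular S to the
# unique entries (Σ11, Σ21, Σ22) of Σ = S*S'.
g(s) = begin
    S = [s[1] 0.0; s[2] s[3]]
    Σ = S * S'
    [Σ[1, 1], Σ[2, 1], Σ[2, 2]]
end

# Forward-difference Jacobian (a simple stand-in for ForwardDiff etc.).
function jacobian_fd(g, s; h = 1e-6)
    m = length(g(s))
    J = zeros(m, length(s))
    for j in eachindex(s)
        e = zeros(length(s)); e[j] = h
        J[:, j] = (g(s .+ e) .- g(s)) ./ h
    end
    J
end

ŝ = [1.0, -0.3, 0.8]    # hypothetical estimates of S's entries
V = 0.01 * I(3)         # hypothetical covariance of ŝ
J = jacobian_fd(g, ŝ)
V_Σ = J * V * J'        # approximate covariance of the entries of Σ
@show diag(V_Σ)
```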