[ANN] NearestCorrelationMatrix.jl v1

NearestCorrelationMatrix.jl is a package that does exactly what the name implies: it finds the correlation matrix nearest, in the Frobenius norm, to a given input matrix, a problem that comes up often in finance.

This started as part of my graduate project, and I’ve continued to improve its stability and interface. The v1 release is built around the CommonSolve.jl interface, so it should feel familiar to anyone who uses a SciML package.

Key Algorithms

  • Newton: a quadratically convergent algorithm that should prove fast and accurate
  • AlternatingProjections: a simple linearly convergent algorithm
  • DirectProjection: a one-step projection onto the cone of correlation matrices
  • JuMPAlgorithm: an extension algorithm that is available when JuMP is loaded
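
Each of these is passed as the solver argument to solve (the full usage pattern is shown under Basic Usage below). For instance, a minimal sketch assuming the zero-argument constructors listed above:

julia> using NearestCorrelationMatrix

julia> sol = solve(NCMProblem(rand(4, 4)), AlternatingProjections(); fix_sym=true);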

Basic Usage (see the README for full details)

julia> using NearestCorrelationMatrix

julia> r0 = rand(4, 4)
4×4 Matrix{Float64}:
 0.0514864  0.849425  0.900972  0.515
 0.212295   0.663351  0.352485  0.702389
 0.964793   0.38255   0.579789  0.970711
 0.918331   0.72088   0.199928  0.566195

julia> prob = NCMProblem(r0);

By default, you are expected to pass in a symmetric matrix, and the solver will complain unless you tell it to fix symmetry for you. The idea is to complain early so that there are no surprises later.

julia> sol = solve(prob, Newton(); fix_sym=true);

julia> sol.X
4×4 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0       0.768931  0.779333  0.608942
 0.768931  1.0       0.450903  0.626379
 0.779333  0.450903  1.0       0.855849
 0.608942  0.626379  0.855849  1.0

There is also a “batteries included” convenience method that fixes any issues with the input by default, and returns the solution directly:

julia> nearest_cor(r0)
4×4 LinearAlgebra.Symmetric{Float64, Matrix{Float64}}:
 1.0       0.768931  0.779333  0.608942
 0.768931  1.0       0.450903  0.626379
 0.779333  0.450903  1.0       0.855849
 0.608942  0.626379  0.855849  1.0

as well as an in-place version:

julia> nearest_cor!(r0);

julia> r0
4×4 Matrix{Float64}:
 1.0       0.768931  0.779333  0.608942
 0.768931  1.0       0.450903  0.626379
 0.779333  0.450903  1.0       0.855849
 0.608942  0.626379  0.855849  1.0

It’s Fast

For a randomly generated 3000×3000 matrix, finding the nearest correlation matrix takes just 22 seconds using the Newton method on a modern desktop. Switching to the MKL.jl linear algebra backend brings that down to 18 seconds!
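
If you want to reproduce a timing like this, here is a minimal sketch (timings will vary with hardware and BLAS thread count; loading MKL.jl swaps in the MKL backend):

julia> using MKL  # optional: swaps the BLAS backend via libblastrampoline

julia> using NearestCorrelationMatrix

julia> prob = NCMProblem(rand(3000, 3000));

julia> @time solve(prob, Newton(); fix_sym=true);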

Other Details

This package spun off from the Bigsimr.jl library, which I helped develop as a core part of my advisor’s research. The docs for that package motivate this one with an application to breast cancer gene data. Please check out that package as well :slight_smile:


Looks great!
Have you implemented a non-allocating version of any of the algorithms?

Thank you!

In general, I’m not sure I can make a fully non-allocating version, but I’m also no expert in the area; speed was my main priority over memory. All the methods rely at least partially on an eigendecomposition of the solution matrix in order to project onto the cone of positive semidefinite matrices, so I don’t think I could avoid all allocations.
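
For context, that projection step looks roughly like this (a minimal sketch of the standard PSD projection, not this package’s actual internals); both the eigendecomposition and the reconstruction allocate:

using LinearAlgebra

# Project a symmetric matrix onto the PSD cone by clamping
# negative eigenvalues to zero. eigen() allocates the
# factorization, and the reconstruction allocates again.
function project_psd(A::Symmetric)
    λ, Q = eigen(A)
    λ .= max.(λ, 0)
    Symmetric(Q * Diagonal(λ) * Q')
end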

It might be possible to make a non-allocating alternating projections method, but I would need a stopping criterion that doesn’t rely on the residual norm between the current iteration’s solution and the previous one. If you have any resources I could use as inspiration, I would be happy to attempt a non-allocating version :slight_smile:

For now, you can pass alias_A=true to the init/solve methods to tell the algorithm that it’s okay to overwrite the input matrix, if the algorithm supports in-place updating.
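
For example, reusing r0 from the announcement above (a sketch; whether the input is actually overwritten depends on the algorithm):

julia> prob = NCMProblem(r0);

julia> sol = solve(prob, Newton(); alias_A=true, fix_sym=true);  # r0 may be overwritten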

You may use FastLapackInterface.jl for a non-allocating eigendecomposition; it allows pre-allocating the required buffers.
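
Something like the following, for instance (a sketch based on the FastLapackInterface.jl docs; HermitianEigenWs and the workspace-first LAPACK.syevr! method are that package’s API, and the exact signature may differ across versions):

using LinearAlgebra, FastLapackInterface

A = rand(100, 100)
A = (A + A') / 2   # symmetric test matrix

# Allocate the LAPACK workspace once up front...
ws = HermitianEigenWs(A; vecs=true)

# ...then reuse it across repeated eigendecompositions.
# Note that syevr! overwrites A during the computation.
vals, vecs = LAPACK.syevr!(ws, 'V', 'A', 'U', A, 0.0, 0.0, 0, 0, 1e-6)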


Oh sweet! I should be able to work this right into the Newton workspace. I’ll see if it results in a speedup, or at least in reduced allocations.