Large matrix operations involving inversion

In a particular problem, I have the following set of three evaluations involving matrices A, B, C of the SparseMatrixCSC type as

B' * A^{-} * B
B' * A^{-} * C
C' * A^{-} * C

Here,
shape(A) = (n, n)
shape(B) = (n, p)
shape(C) = (n, 1)

I have to deal with n > 1_000_000 and p up to 15.

I am currently using the following piece of code,

B' * (A \ B)
B' * (A \ C)
C' * (A \ C)

In another case, with n=700_000 and p = 9 evaluation took about 20 minutes on a system with 64GB RAM, 32 logical processors and 2.4GHz specifications.

On the same machine, n = 500_000 along with p = 12 is giving OutOfMemoryError(). Then, I tried on a node of the HPC cluster with 128GB RAM, where it didn’t either give outofMemoryError() or didn’t finish even after 2 hours.
This is despite the fact that the number of elements in the first and second cases is in the same range (about 6_000_000).

Please let me know
1. Is the procedure I am using computationally efficient?
2. Any tricks/workarounds for dealing with large matrices where n>1_000_000 and p>10.

Thanks, in advance.

How many nonzeros?

Is A structured at all? Like, is it symmetric, antisymmetric, something? If it is, there might be some tricks you could do.

Otherwise, one improvement would be to form A_lu = lu(A) and use that instead of A in the operations. Basically it does the slow and hard part of the equation solving upfront, so you don’t have to do it three times.

2 Likes

Thanks, Gustaphe.

Yes, A is the symmetric and positive definite matrix. Also, non-zero terms of A are expected to form a banded pattern about the main diagonal.

then use BandedMatrices.jl it will be way faster (at least 10x).

1 Like

For general SPD matrices there’s the cholesky decomposition (replace lu by chol above). I don’t know how that compares to/interacts with bandedness. There is an implementation in BandedMatrices.jl for LU composition though, so I would probably use that.