Faster way for matrix multiplication involving a Hermitian matrix with special properties

I have a simple question that I hope can make certain kinds of matrix multiplications in my code faster.
Is there a way to exploit the structure of products of the form phi*P*adjoint(phi) to make the computation faster? Here P is a low-rank Hermitian matrix; specifically, its eigenvalues are only ones and zeros (half of the eigenvalues are ones). phi is in general a complex matrix (not a square matrix), but its columns are orthogonal.
I would think that the combination of the Hermiticity of P, its low rank, and the orthogonality of the columns of phi could be used to optimize these kinds of products. It seems like the standard linear algebra package doesn't make use of that, as illustrated in this example:

using LinearAlgebra, BenchmarkTools
A1 = rand(ComplexF64, 10, 10);
A = A1 + adjoint(A1); # make a Hermitian matrix
phi = rand(ComplexF64, 1000, 10);
function test(phi, A) # test the performance when A is Hermitian
    return phi*A*adjoint(phi)
end
@btime test($phi, $A)

function test2(phi, A1) # test the performance when A1 is not Hermitian
    return phi*A1*adjoint(phi)
end
@btime test2($phi, $A1)

The timings with the Hermitian matrix and with the non-Hermitian matrix come out the same, whereas if the symmetry were exploited, the Hermitian calculation should take roughly half the time.
Any insight into this problem would be appreciated. Thanks!

1 Like

It could be worthwhile explicitly using the Hermitian constructor: see Linear Algebra · The Julia Language
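For reference, something like this (a minimal sketch; the wrapper just tags the existing storage, so constructing it is essentially free):

using LinearAlgebra
A1 = rand(ComplexF64, 10, 10)
A = A1 + adjoint(A1)   # numerically Hermitian, but stored as a plain Matrix
ishermitian(A)         # true, but this is a runtime check, not a type-level tag
H = Hermitian(A)       # wrap so dispatch can select Hermitian-specialized methods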

As the previous person suggested, you should wrap the matrix in Hermitian(A) if you want Julia to know it’s Hermitian. There are only a handful of functions where Julia speculatively checks for properties like Hermitian, triangular, or diagonal structure to decide which algorithm to use. That said, I don’t think it will help here in any case.

If your A has a symmetric factorization (such as a Cholesky or LDL^\top factorization), then you can get the first of your two multiplies for half cost and change your overall computation (e.g., to (phi * L) * D * (phi * L)', for the LDL^\top case). But unless A is small relative to phi (like in your example), this factorization may be more expensive than the savings it offers. In any case, it’s something you should apply manually.
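Concretely, something like this untested sketch using bunchkaufman (Julia’s pivoted LDL' factorization for Hermitian indefinite matrices) and the A and phi from your example; the permutation handling follows the documented F.L*F.D*F.L' ≈ A[F.p, F.p] relation:

using LinearAlgebra
F = bunchkaufman(Hermitian(A, :L))   # pivoted LDL' with F.L unit lower triangular
M = phi[:, F.p] * F.L                # triangular multiply: ~half the flops of phi*A
result = (M * F.D) * M'              # should equal phi * A * phi' up to roundoff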

So, of the two multiplies, you can save half on the first if A is symmetrically factorized. Further, you could* save half on the second because the result is Hermitian.

*HOWEVER, I am not aware of a routine in BLAS (or other libraries or Julia) for a general matrix multiply that computes only half of the output matrix. That would be necessary for the savings on the second multiply. So unless someone else knows of one, those savings are off the table. This leaves you with either no savings (general A) or only the savings on the first multiply (suitably factorized A).

That’s my assessment, anyway. Maybe someone else knows of a suitable half-output matrix multiply? Or some radically different approach?

2 Likes

Thanks @jd-foster and @mikmoore for your replies. Just to clarify: if ishermitian(A) returns true, is it still required to wrap with Hermitian(A) for the LinearAlgebra package to specialize on A being Hermitian? (I’d like to know this for future reference even if it doesn’t help me in this specific example.) It seems like the LinearAlgebra package doesn’t speed up the calculation by itself, even if I explicitly add Hermitian(A).
@mikmoore wouldn’t what you describe also work if A has an eigenvalue decomposition?

For some functions, Julia will check the structure up-front to decide what algorithm to use. edit (or the macro version @edit) is useful if you want to answer this question for some specific function. See edit(\, (AbstractMatrix, AbstractVector)) and edit(eigen, (AbstractMatrix,)) for examples of functions that do.

But I wouldn’t count on this in general. It’s best to wrap with Hermitian manually if you know your matrix has that structure and you think it might have a specialized algorithm.

Not really. A is Hermitian, so its eigendecomposition is V*D*V' for a unitary V and a diagonal D, and we could implement your product as

F = eigen(Hermitian(A))      # A == V * D * V'
V, D = F.vectors, Diagonal(F.values)
phi_V = phi * V
result = phi_V * D * phi_V'

Let’s use your example sizes size(phi) == (1000,10) and size(A) == size(V) == size(D) == (10,10). V likely doesn’t have a sparsity pattern we could exploit to multiply more efficiently, so phi*V requires 1000*10*10 multiply-adds to produce a size == (1000,10) result. Then phi_V * D requires 1000*10 multiplies because D is diagonal. Then the final product is a (1000,10) times a (10,1000) matrix and requires 1000*1000*10 multiply-adds to form the dense 1000x1000 result, which we could reduce to roughly 1000*(1000+1)/2*10 if we had a function to compute only half the result (which I don’t believe we do, nor has BLAS felt the need to have one).

Notably, phi*V is just as expensive as phi*A. Since the eigendecomposition route additionally requires applying the diagonal matrix, it actually takes slightly more work than phi * A * phi'.

The reason that an L*L' or L*D*L' factorization helps is that L is triangular. Multiplying phi*L needs only half the computation of phi*V or phi*A. But if you don’t already have A appropriately factorized, factorizing it may cost more than it saves.
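For what it’s worth, it’s the LowerTriangular (or UnitLowerTriangular) wrapper that lets the product dispatch to the cheaper triangular multiply; a tiny illustration:

using LinearAlgebra
L = LowerTriangular(rand(ComplexF64, 10, 10))
M = rand(ComplexF64, 1000, 10)
M * L    # dispatches to a triangular multiply (BLAS trmm), ~half the flops of M * Matrix(L)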


The short summary is that you could theoretically save half of the first multiply with a suitably factorized A, and half of the second multiply (which dominates at your sizes) if you can find an optimized half-output matrix multiply (if you write it yourself and aren’t a BLAS expert, it will likely be much slower than producing the full output with an optimized routine). Go ahead and do what you can with what you have handy, but I expect it isn’t worth getting too tangled up in trying to optimize this.

3 Likes

Thanks @mikmoore for your explanation. I’m not sure that my A has a symmetric factorization. (edit: an eigenvalue decomposition of A gives a diagonal matrix of ones and zeros, so I think this means it doesn’t have a symmetric factorization, as it is not positive definite.) It does mean, however, that A is low rank, and that should also help speed up the multiplication. I have edited the main post to include this information.
It’s kind of a bummer that there is no built-in optimization for these kinds of products :slight_smile: especially since I’d think they’re pretty common. And yes, I’m not really qualified to take on the task of developing a package for that :slight_smile:

We do have a specialization for dot(x,A,y) == x'*A*y when x and y are vectors, but that doesn’t apply to your case.
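For reference, that’s the three-argument dot (a minimal example, using the A from your post; not applicable here since phi is a matrix):

x = rand(ComplexF64, 10); y = rand(ComplexF64, 10)
dot(x, A, y)   # evaluates x' * A * y without materializing the intermediate A * y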

If your eigendecomposition has zeros, your matrix is rank deficient. Since the matrix is Hermitian and the nonzero eigenvalues are positive, it is positive semidefinite. This means that you can write your A as B'*B where size(B) == (R,N) with R == rank(A) < N. This will reduce the cost to R/N of the naive cost on both multiplies.

Finding a low rank factorization is something you can do from an eigendecomposition. Just drop the (near-)zero terms. But you might already have A in a low rank form based on how you computed it in the first place.
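A rough, untested sketch of that, using the A and phi from your example (the 1e-10 tolerance is an arbitrary cutoff, and since your nonzero eigenvalues are exactly 1 the sqrt scaling is effectively a no-op):

using LinearAlgebra
F = eigen(Hermitian(A))                  # A == V * Diagonal(λ) * V'
keep = F.values .> 1e-10                 # drop the (near-)zero eigenvalues
B = Diagonal(sqrt.(F.values[keep])) * F.vectors[:, keep]'   # A ≈ B' * B with size(B) == (R, N)
C = phi * B'                             # size (1000, R) instead of (1000, N)
result = C * C'                          # ≈ phi * A * phi'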

Also your final product phi * A * phi' is very low rank. If you need all the entries, then fine. But if you’re using it in further calculations, keeping it low-rank factorized is likely much more efficient.