`A` has orthonormal columns, but is rectangular (tall). Currently, I allocate caches `C` and `D`, and evaluate

```julia
mul!(D, A', mul!(C, B, A))
```
In my use case, `A` and `B` are somewhat large matrices, so I'd ideally like to avoid allocating the intermediate matrix `C`. I was wondering if there's a way to use the structure of `A` to come up with a smarter algorithm? Unfortunately, `B` is a general (possibly complex) matrix and doesn't have an obvious structure.
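For context, the pattern above with preallocated caches can be sketched as follows (the sizes and the way `A` is constructed here are purely illustrative):

```julia
using LinearAlgebra

n, k = 300, 288                     # illustrative sizes
A = Matrix(qr(randn(n, k)).Q)       # a matrix with orthonormal columns
B = randn(n, n)                     # general square matrix, no structure

# Allocate the caches once and reuse them, so repeated
# evaluations for new B matrices allocate nothing further.
C = Matrix{Float64}(undef, n, k)    # holds B * A
D = Matrix{Float64}(undef, k, k)    # holds A' * B * A
mul!(D, A', mul!(C, B, A))          # D now contains A' * B * A
```

Reusing `C` across calls already avoids repeated allocation; the question is whether the intermediate can be avoided entirely.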
From your setup we know that `B` is square, but we can't assume anything about symmetry?
Is `A` very "thin" relative to `B`? Could it be feasible to compute a truncated SVD up to the number of columns of `A`, then transform the singular vectors by `A`?
`A` and `B` are roughly similar-sized (e.g. if `B` is 3000 × 3000, `A` is 3000 × 2880). `B` is indeed square, but there's no symmetry.
Do you have to compute each product only once, or multiple times with the same B (e.g. A_i^* B A_i for i=1,2,\dots) or with the same A?
I do need to compute the product multiple times, for different `B` but the same `A`.
If you need to perform the multiplication really a lot of times, AND the matrix `A` has an SVD `A = U S V'` which can be truncated without altering the representation of `A`, then it might pay to compute the SVD once up front and gain during each evaluation of `A' B A`, since there you multiply smaller matrices.
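A sketch of that idea (the rank `r` and the sizes are placeholders, and the random `A` here is not actually low-rank, so this only demonstrates the shapes): from `A = U S V'` one gets `A' B A = V S (U' B U) S V'`, and after truncating to rank `r` the inner product `U' B U` is only `r × r`.

```julia
using LinearAlgebra

n, k, r = 300, 288, 100        # illustrative sizes; r = truncation rank
A = randn(n, k)                # stand-in; assumed well-approximated at rank r
B = randn(n, n)

# One-time cost: thin SVD of A, truncated to rank r.
F = svd(A)
Ur, Sr, Vr = F.U[:, 1:r], Diagonal(F.S[1:r]), F.V[:, 1:r]

# Per-B cost: A' * B * A ≈ Vr * Sr * (Ur' * B * Ur) * Sr * Vr',
# where the inner factor Ur' * B * Ur is only r × r.
D = Vr * (Sr * (Ur' * B * Ur) * Sr) * Vr'
```

The per-`B` work is dominated by `Ur' * B`, which is O(r n²) instead of O(k n²), so the saving depends on how far below `k` the rank `r` can be pushed.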