Here's an example where a matrix that is sufficiently large and rank-deficient benefits from something fancier:

```
using KernelMatrices, BenchmarkTools, LinearAlgebra
const M = randn(50, 10_000);
const A = M'M;
const x = randn(10_000);
function fact_method(A, x)
  (U, V) = KernelMatrices.ACA(A, 1e-14)
  dot(U'*x, V'*x)
end
@btime dot($x, $A, $x)     # 27.1 ms on my machine w/ 4 LAPACK threads
@btime fact_method($A, $x) # 27.2 ms with same setup.
```

I picked a rank that is about the break-even point, but if you make it lower than 50 then `fact_method` wins.
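To make the break-even intuition concrete, here is a small sketch of the identity that the factorized quadratic form exploits (illustrative sizes and my own setup, not the package's API): if `A = U*V'` with thin `n×r` factors, you only need two `r×n` matrix-vector products instead of touching all `n^2` entries of `A`.

```julia
using LinearAlgebra

# If A = U*V' with U, V of size n×r, then x'Ax = (U'x)'(V'x),
# which costs O(n*r) rather than the O(n^2) of dot(x, A, x).
# Sizes here are shrunk from the post's example for brevity.
n, r = 2_000, 50
U = randn(n, r); V = randn(n, r)
A = U * V'
x = randn(n)
@assert isapprox(dot(x, A, x), dot(U' * x, V' * x); rtol=1e-8)
```

The `r = 50` in the benchmark above is exactly the crossover where the `O(n*r)` factor products stop paying for the cost of computing the factorization itself.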

This factorization is the adaptive cross approximation, which I think is reasonably comparable to an LU factorization with a fancy pivoting strategy, but which chooses the pivots based on partial information to avoid passing over the whole matrix. I'm not super knowledgeable about the lowest-level details of pivoting strategies, though, so that might be incorrect. In any case, I implemented it here a long time ago. I don't think it's embarrassingly slow, but that package was one of my first big Julia projects, and so while I haven't seen anything better, I bet somebody like @stevengj could sit down and write something much faster in the timespan of a coffee break.
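For intuition only, here is a naive full-pivoted version of the cross-approximation idea. Real ACA uses partial pivoting precisely so that it never has to scan the whole residual the way this does, and the function name, tolerance handling, and loop structure below are mine, not the package's:

```julia
using LinearAlgebra

# Greedily peel off rank-1 "crosses" A[:,j]*A[i,:]/A[i,j], choosing the
# pivot (i,j) as the largest remaining residual entry, until the residual
# falls below an (absolute) tolerance. Illustrative sketch only.
function naive_cross_approx(A::AbstractMatrix, tol::Real)
    R = copy(float.(A))
    us = Vector{Vector{Float64}}()
    vs = Vector{Vector{Float64}}()
    while true
        i, j = Tuple(argmax(abs.(R)))  # full pivot: largest |entry|
        p = R[i, j]
        abs(p) <= tol && break
        u = R[:, j] ./ p
        v = R[i, :]
        push!(us, u); push!(vs, Vector(v))
        R .-= u * v'                   # eliminate that row and column
    end
    U = reduce(hcat, us)
    V = reduce(hcat, vs)
    return U, V                        # A ≈ U * V'
end
```

Each step zeroes out the pivot row and column of the residual, so for an exactly rank-`r` matrix this terminates after `r` steps, which is the LU-with-pivoting flavor mentioned above.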

I bring all this up to say that the code isn't great and the factorization as currently implemented doesn't specialize for symmetry, but even still, for non-ridiculous matrix conditions it does beat the naive `dot(x, P, x)`. I would bet that a smarter implementation could be even more competitive. So I do think that the answer will really depend on some parameters of the problem like size and rank.
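As a small aside on the symmetry point: for a Gram matrix like the `A = M'M` in the example above, a factorization that respects `A = F'F` collapses the quadratic form to a single thin product and a squared norm. This is just my illustration of what a symmetry-aware specialization could exploit, not what the package currently does (sizes shrunk for brevity):

```julia
using LinearAlgebra

# For A = M'M, the quadratic form is x'Ax = ||M*x||^2,
# so one 50×n product with x suffices.
M = randn(50, 2_000)
A = M' * M
x = randn(2_000)
@assert isapprox(dot(x, A, x), sum(abs2, M * x); rtol=1e-8)
```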