Where can I find certain matrix multiplication optimizations?

Hi, there are a couple of matrix multiplication optimizations that would have a big impact on the performance of my project. They are:

  • Automatic coalescing of additions with multiplications, since *gemm APIs typically allow for an add as well
  • Only computing one triangle of a matrix product if I know that it’s a symmetric matrix

Am I missing these in the docs, or do they exist somewhere (or maybe in some third-party Julia project)?

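For your first point, have you seen mul!? Its five-argument form, mul!(C, A, B, α, β), computes A*B*α + C*β in place and overwrites C, so the addition is fused into the multiplication.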
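
Here is a minimal sketch of the fused multiply-add with mul!, assuming Julia 1.3 or later (where the five-argument method is available). The matrices and scalars are made-up values for illustration only.

```julia
using LinearAlgebra

# Small illustrative inputs (hypothetical values).
A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
C = ones(2, 2)

# Reference result computed the allocating way, for comparison.
expected = 2.0 .* (A * B) .+ 3.0 .* C

# In-place update: C = A*B*2.0 + C*3.0, with no temporary for A*B.
mul!(C, A, B, 2.0, 3.0)

@assert C ≈ expected
```

For dense Float64 matrices like these, the call should end up in a BLAS gemm routine, which is where the add comes for free.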

Note also that many special matrix types have specialized methods to make things more efficient, e.g. Symmetric.

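For your second point, see e.g. BLAS.syrk! and BLAS.syr2k!, which compute symmetric rank-k and rank-2k updates and only use one triangle of the output matrix.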
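
As a rough sketch of how syrk! can be used for a product that is known to be symmetric, here is A * A' computed into the upper triangle only. The input matrix is a made-up example, and wrapping the result in Symmetric is just one way to read it back as a full matrix.

```julia
using LinearAlgebra

# Illustrative 2×3 input (hypothetical values).
A = [1.0 2.0 3.0; 4.0 5.0 6.0]
C = zeros(2, 2)

# C := 1.0 * A * transpose(A) + 0.0 * C.
# 'U' means only the upper triangle of C is used; 'N' means A is not transposed.
BLAS.syrk!('U', 'N', 1.0, A, 0.0, C)

# View the upper triangle as a full symmetric matrix without copying.
S = Symmetric(C, :U)

@assert S ≈ A * A'
```

Passing 'T' instead of 'N' gives transpose(A) * A, and a nonzero beta accumulates into the existing contents of C, which covers the add-plus-multiply case as well.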