Hi, there are a couple of matrix multiplication optimizations that would have a big impact on the performance of my project. They are:
- Automatic coalescing of an addition with a multiplication (e.g. A*B + C), since *gemm APIs compute alpha*A*B + beta*C and so allow for the add in the same call
- Only computing one triangle of a matrix product when I know the result is symmetric
Am I missing these in the docs, or do they exist somewhere else (maybe in some third-party Julia package)?
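
To make the two cases concrete, here is a minimal sketch of what each one looks like when written out by hand against the low-level `LinearAlgebra.BLAS` wrappers. The matrix names, sizes, and random data are made up purely for illustration, and the symmetric case shown is only the special form `A*A'`, which is the one `syrk` covers.

```julia
using LinearAlgebra

# Hypothetical sizes and data, for illustration only.
A = rand(4, 3)
B = rand(3, 4)
C = rand(4, 4)
C0 = copy(C)          # keep the original C so the result can be checked

# Case 1: fused multiply-add. gemm! overwrites C with 1.0*A*B + 1.0*C,
# so the addition happens inside the same BLAS call as the product.
BLAS.gemm!('N', 'N', 1.0, A, B, 1.0, C)

# Case 2: symmetric product A*A'. syrk! only writes the requested
# triangle of S (here the upper one) and leaves the other untouched.
S = zeros(4, 4)
BLAS.syrk!('U', 'N', 1.0, A, 0.0, S)

# Wrap the half-filled matrix so it behaves like the full symmetric result.
S_full = Symmetric(S, :U)
```

What I'm hoping for is a way to get this behaviour from ordinary expressions rather than having to call the BLAS wrappers explicitly.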