Actually, I don’t think the saving of 1/8 of the multiplications can be realized in Julia. This kind of optimization likely has to be done at a lower level, such as LLVM, native code, or even hardware.
As for the usefulness of such algorithms, the key point is that they are not meant to simply multiply two 4x4 matrices of scalars. Their use cases are when the elements of the 4x4 matrices are themselves matrices, i.e. block matrices, just like how the Strassen algorithm is used: you multiply two very large matrices by recursively partitioning them into 2x2 block matrices and applying Strassen. In this setting, as I stated in a previous post, the “active multiplications” are expensive, with complexity N^3 (for the naive method), while the “inactive multiplications” are cheap, with complexity N^2. That complexity hierarchy, combined with recursion, is what reduces the asymptotic complexity of Strassen to N^2.80735 (= N^(log2 7)), regardless of how you optimize the “inactive multiplications”. Optimizing the inactive multiplications only affects the matrix size down to which Strassen is still faster than the naive method, i.e. the proper endpoint of the recursion.

The new 4x4 algorithm is similar: due to the complexity hierarchy, for large enough matrices this new algorithm is always faster than the naive method (and also Strassen) by partitioning. But if you do not optimize the inactive multiplications sufficiently, the threshold matrix size, below which applying the new algorithm is no longer beneficial, becomes so large (like 10000x10000) that the new algorithm can hardly be used in real life.
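For concreteness, here is a minimal sketch of the recursive scheme in Julia, assuming square matrices whose size is a power of 2. The `threshold` keyword is the crossover point discussed above; its default value is illustrative, not tuned:

```julia
# Minimal recursive Strassen sketch (assumes n×n inputs with n a power of 2).
# Below `threshold`, fall back to the built-in (BLAS) product.
function strassen(A::AbstractMatrix, B::AbstractMatrix; threshold::Int=128)
    n = size(A, 1)
    n <= threshold && return A * B
    h = n ÷ 2
    A11 = A[1:h, 1:h];   A12 = A[1:h, h+1:n]
    A21 = A[h+1:n, 1:h]; A22 = A[h+1:n, h+1:n]
    B11 = B[1:h, 1:h];   B12 = B[1:h, h+1:n]
    B21 = B[h+1:n, 1:h]; B22 = B[h+1:n, h+1:n]
    # The 7 "active" multiplications recurse; all the additions around them
    # are the cheap O(N^2) part of the complexity hierarchy.
    M1 = strassen(A11 + A22, B11 + B22; threshold)
    M2 = strassen(A21 + A22, B11; threshold)
    M3 = strassen(A11, B12 - B22; threshold)
    M4 = strassen(A22, B21 - B11; threshold)
    M5 = strassen(A11 + A12, B22; threshold)
    M6 = strassen(A21 - A11, B11 + B12; threshold)
    M7 = strassen(A12 - A22, B21 + B22; threshold)
    C = similar(A)
    C[1:h, 1:h]     = M1 + M4 - M5 + M7
    C[1:h, h+1:n]   = M3 + M5
    C[h+1:n, 1:h]   = M2 + M4
    C[h+1:n, h+1:n] = M1 - M2 + M3 + M6
    return C
end
```

A quick check like `strassen(rand(256, 256), rand(256, 256)) ≈ A * B` should agree with the ordinary product up to floating-point error. In practice the threshold has to be fairly large before this beats the optimized BLAS product, which is exactly the crossover issue described above.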
If you want to see working code for (recursive) Strassen in Julia, you can refer to this post: