Algorithmic complexity of matrix multiplication in Julia

IlliaFaraji · March 18, 2021, 12:23pm

This is a slightly odd question, but I am working on a project for which I would like to know the algorithmic complexity of matrix multiplication in Julia. i.e. if A, B are two dimensional n x n arrays stored in the usual way (no sparse arrays etc.), what is the worse case for the complexity of the operation A*B in Julia?

Thanks in advance for your responses.

nsajko · March 18, 2021, 12:38pm

I think this depends on the specific implementation of BLAS in use. I guess they mostly use something like this:

ChrisRackauckas · March 18, 2021, 1:04pm

That method (and the others <3) are not competitive at normal matrix sizes.

stevengj · March 18, 2021, 1:16pm

No, virtually no optimized BLAS uses Strassen’s algorithm in practice to my knowledge. The serious performance studies I’ve seen (which use an optimized BLAS for the base case) show that the crossover m for one step of m\times m Strassen to be marginally better is a few thousand, but for it to be clearly superior you need to be > 10^4, at which point most applications are starting to switch over to sparse algorithms or other algorithms that exploit special structure. Moreover, once you throw in multithreading and the fact the underlying BLAS codes are being continually tuned for new hardware, no one seems to want to put in the effort to keeping a highly tuned Strassen up to date. Finally, Strassen requires large temporary buffers, which are problematic because the huge matrices where it would be theoretically useful are often near the limits of available memory.

The upshot is that nearly all practical dense matrix–matrix algorithm implementations have \Theta(m^3) arithmetic complexity.

stevengj · March 18, 2021, 2:05pm

That being said, since optimizations and parallelism can be easier in a high level language, and memory is getting more plentiful, so it might be nice to see a Strassen package to perform fresh experiments with, perhaps built on top of Octavian.jl.

For example, you could implement a special matrix type with storage optimized for Strassen-type algorithms.

foobar_lv2 · March 18, 2021, 2:14pm

Julia uses a conservative choice of openBLAS / MKL for linear algebra. These don’t use fast (ie subcubic) matrix multiplication, and afaiu there are no plans to change that in the near future (neither on julia nor on openBLAS/MKL side).

There are alternative competitive libraries like https://github.com/flame/blis (doesn’t use fast matmul either, but afaiu serves as the base for most attempts to make strassen useful in real life), and there is active research to use fast matrix multiplication in real life.

Consider e.g. https://apps.cs.utexas.edu/apps/sites/default/files/tech_reports/GPUStrassen.pdf or https://arxiv.org/pdf/1605.01078.pdf : both papers report cross-overs on the order of 2k-10k square matrices.

So I guess it’s not too unrealistic to see BLIS as a third BLAS provider in julia that by then actually uses Strassen, in maybe 3-5 years.

Oscar_Smith · March 18, 2021, 2:17pm

http://jianyuhuang.com/papers/sc16.pdf actually shows that you can implement Strassen using only the packing vectors already used in OpenBlas, and the performance crossover can be as low as 512x512 (at least for single threaded). Getting a good implementation in Julia would be really cool.

simeonschaub · March 18, 2021, 2:18pm

Be careful what you whish for : GitHub - JuliaLinearAlgebra/BLIS.jl: This repo plans to provide a low-level Julia wrapper for BLIS typed interface.

Specifically, Update to use libblastrampoline · Issue #3 · JuliaLinearAlgebra/BLIS.jl · GitHub should make this a lot more straightforward in Julia 1.7.

ChrisRackauckas · March 18, 2021, 2:18pm

Are there any Strassen GPU results? That might be where the value is, if there’s enough memory.

foobar_lv2 · March 18, 2021, 2:19pm

IlliaFaraji · March 20, 2021, 12:57pm

Thank you everyone for the responses. I did not expect the community to be this helpful.

Topic		Replies	Views
Specialized matrix-matrix multiplication algorithm New to Julia question , performance , linearalgebra	5	418	July 9, 2024
Julia matrix-multiplication performance Performance linearalgebra	20	8668	October 30, 2022
Faster matmul Numerics	3	483	March 9, 2024
Native Julia gemm implementation Performance	16	3569	May 3, 2018
Is there an easy way to parallelise matrix multiplication? Performance	8	4717	April 19, 2019

Algorithmic complexity of matrix multiplication in Julia

Related topics