Multiplying boolean matrices

e3c6 · November 7, 2021, 10:41am

Why is there such a large performance difference between multiplying float matrices and matrices of other element types? I understand that float matrices hit BLAS, which is heavily optimized. But is this a fundamental difference, or are we just lacking a “BLAS for boolean matrices”, for example?

Can we expect this difference to disappear in the future with a generic Julia implementation of BLAS?

Is there a package one can use now to speed up multiplication of boolean matrices?

e3c6 · November 7, 2021, 10:54am

As suggested by @antoine-levitt, it is interesting to look at the non-BLAS float matmul currently implemented in Julia.

Interestingly, this is also considerably faster than the boolean matmul. Is there some intrinsic difficulty in multiplying boolean matrices?

Note that a fast boolean matmul could be useful in deep learning for one-hot encoded data.
If the one-hot data could be stored as a BitArray, which packs one bit per entry instead of the one byte per entry of an Array{Bool}, memory usage would be much lower.
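To put a rough number on the memory argument, here is a minimal sketch comparing the two storage formats. The variable names and the 1024×1024 size are arbitrary choices for illustration, and the sizes in the comments are approximate because `Base.summarysize` also counts a small object header.

```julia
# Same logical content, two storage formats.
dense  = zeros(Bool, 1024, 1024)   # Matrix{Bool}: one byte per entry
packed = falses(1024, 1024)        # BitMatrix: one bit per entry

dense_bytes  = Base.summarysize(dense)    # roughly 1 MiB
packed_bytes = Base.summarysize(packed)   # roughly 128 KiB

println("Matrix{Bool}: ", dense_bytes, " bytes")
println("BitMatrix:    ", packed_bytes, " bytes")
```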

antoine-levitt · November 7, 2021, 11:10am

Hm, generic_matmatmul still does something fancy (see _generic_matmatmul! in LinearAlgebra/src/matmul.jl), so that’s not quite a plain comparison. You’d have to code your own three-loop algorithm to compare. If you want fast bool matmuls, maybe take a look at Octavian.jl (JuliaLinearAlgebra/Octavian.jl on GitHub), a multi-threaded BLAS-like library that provides pure Julia matrix multiplication.
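For reference, a plain three-loop baseline might look like the sketch below. The name `naive_matmul` is made up for this example and is not part of LinearAlgebra; the accumulator type is chosen to match what `+` and `*` produce for the element types (Int for Bool inputs), which is an assumption about the semantics you want.

```julia
# Naive three-loop matrix product (hypothetical helper, no blocking or SIMD tricks).
function naive_matmul(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("inner dimensions must match"))
    m, n, p = size(A, 1), size(A, 2), size(B, 2)
    # Accumulate in the type that a sum of products would have (Int for Bool inputs).
    z = zero(eltype(A)) * zero(eltype(B))
    T = typeof(z + z)
    C = zeros(T, m, p)
    # j outermost, i innermost: walks A and C down their columns (column-major friendly).
    for j in 1:p, k in 1:n, i in 1:m
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end
```

Timing this against `A * B` for the same element type (for example with BenchmarkTools) gives a comparison that is not affected by whatever the generic fallback does internally.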

e3c6 · November 7, 2021, 11:33am

Using Octavian, things are faster, but float matmul still wins significantly.

Unfortunately, Octavian doesn’t seem to work with BitMatrix; see “Error with BitArray”, issue #123 on JuliaLinearAlgebra/Octavian.jl.

freemint · November 7, 2021, 6:54pm

Are you interested in a matmul whose addition satisfies 1+1 = 1 or 1+1 = 0?
The performance difference comes from indexing into a BitArray, which has to extract the individual bit from a packed word on every element access. It could be implemented a lot faster with bit-level boolean operations: AND the two vectors word by word, then either check whether the result is non-zero (when 1+1 = 1) or check the last bit of the popcount intrinsic (when you want 1+1 = 0).
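A minimal sketch of that idea follows. It packs rows of A and columns of B into `UInt64` words itself rather than touching BitArray internals. The names `pack_bits` and `bool_matmul` and the `mod2` keyword are invented for this illustration, and the code is not tuned for speed.

```julia
# Pack a Bool vector into 64-bit words, least significant bit first (hypothetical helper).
function pack_bits(v::AbstractVector{Bool})
    words = zeros(UInt64, cld(length(v), 64))
    for (idx, bit) in enumerate(v)
        if bit
            words[((idx - 1) >> 6) + 1] |= UInt64(1) << ((idx - 1) & 63)
        end
    end
    return words
end

# Boolean matrix product using word-wise AND.
# mod2 = false: 1 + 1 = 1 (OR of the ANDs); mod2 = true: 1 + 1 = 0 (XOR of the ANDs).
function bool_matmul(A::AbstractMatrix{Bool}, B::AbstractMatrix{Bool}; mod2::Bool = false)
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("inner dimensions must match"))
    m, p = size(A, 1), size(B, 2)
    rows = [pack_bits(view(A, i, :)) for i in 1:m]   # packed rows of A
    cols = [pack_bits(view(B, :, j)) for j in 1:p]   # packed columns of B
    C = falses(m, p)
    for j in 1:p, i in 1:m
        r, c = rows[i], cols[j]
        if mod2
            # Only the last bit of the total popcount matters, so XOR-ing the counts is enough.
            parity = 0
            for w in eachindex(r)
                parity = xor(parity, count_ones(r[w] & c[w]))
            end
            C[i, j] = isodd(parity)
        else
            # Any overlapping set bit makes the entry true.
            C[i, j] = any(w -> (r[w] & c[w]) != 0, eachindex(r))
        end
    end
    return C
end
```

Each inner step handles 64 entries at once, which is where the speedup over per-element BitArray indexing would come from. Padding bits in the last word are zero on both sides, so they do not affect either result.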
