Consider the following example code:
```julia
using LinearAlgebra, BenchmarkTools

n = 200
xb = BitVector(rand(Bool, n))
xf = Float32.(xb)
W = randn(Float32, n, n)
y = zeros(Float32, n);

function mymul!(y, A, x)
    y .= 0
    @inbounds for j = 1:length(x)
        @simd for i = 1:length(y)
            # y[i] = muladd(A[i,j], x[j], y[i])
            y[i] += A[i,j] * x[j]
        end
    end
    return y
end
```
Here I find the following benchmarks:
```julia
@btime mymul!($y, $W, $xf);  # 4.542 μs (0 allocations: 0 bytes)
@btime mul!($y, $W, $xf);    # 7.541 μs (0 allocations: 0 bytes)
@btime mymul!($y, $W, $xb);  # 15.586 μs (0 allocations: 0 bytes) (why?)
@btime mul!($y, $W, $xb);    # 7.864 μs (0 allocations: 0 bytes)
```
So why is `mymul!` so much slower when fed a `BitVector`, compared to `mul!` from LinearAlgebra?
Note that this question is very similar to Why mul! is so fast?. In that thread I deliberately focused on Float64 arguments to keep things simple; here I am asking specifically about the difference for `BitVector` arguments. I expect the reason here is different from the one in that thread, because of the way the bools are stored as packed bits in a `BitVector`. Also note that I didn't see such a big difference when using Float64 instead of Float32, and I'm not sure how relevant that is.
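For context, a `BitVector` packs 64 bools into each `UInt64` of its `chunks` field, so every `x[j]` read has to locate a chunk and mask out a single bit. A rough sketch of that extraction (my own illustration of the packed layout, not code from either thread):

```julia
# Illustrative sketch of what indexing a BitVector involves internally.
# The `chunks` field name matches Base's BitArray, but the function
# `bit_getindex` is hypothetical, just to show the shift-and-mask work.
function bit_getindex(chunks::Vector{UInt64}, i::Int)
    i1, i2 = divrem(i - 1, 64)          # which chunk, which bit offset
    return (chunks[i1 + 1] >> i2) & 0x1 == 0x1
end

xb = BitVector([true, false, true])
@assert bit_getindex(xb.chunks, 1) == true
@assert bit_getindex(xb.chunks, 2) == false
@assert bit_getindex(xb.chunks, 3) == true
```

Whether this per-element unpacking (or the `Float32 * Bool` promotion it feeds) is actually what slows down the `@simd` loop is exactly what I'd like to understand.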