Efficient static bit array

I’m trying to find a way to implement a static bit array that is performant and memory efficient. I wonder if someone has solved some of the issues below. For example, I want to be able to create and access them quickly and store them efficiently. Quick compared to other statically allocated things. By “static”, I mean not dynamically allocated. And I actually want to use collections (arrays and dictionaries) of these objects.

bitarray.jl contains the implementation of Base.BitArray, a dynamically allocated array, in 2000 lines. At present I don’t need much of the array interface. I do want to support shifting, bitwise operations, displaying as bits, and a few other things.

For some purposes it seems like the following is best.

  • Wrap fixed-sized integers in BitIntegers.jl. The latter allows me to define UIntX where X can be really big.
  • The width of an unsigned ints in BitIntegers.jl has to be a multiple of 8. So, I may need to store the number of those bits that I am actually using if I want to represent arrays without this restriction. So I create a struct containing two fields, a Vector of my wrapped UIntX and n the (common) width of each bit array in the vector. I do the same for a Dictionary from Dictionaries.jl.

I have some of this implemented in BitsX.jl In particular, this code quickly creates the appropriate UInt type if it does not exist.

I also use Bits.jl which defines

struct BitVector1Mask{T<:Real} <: AbstractBitVector1
    x::T
    len::Int
end

But this is wasteful if I want an Array of BitVector1Mask where len is the same for each element.
Instead, I tried

 struct StaticBitVector{T<:Real, N} <: Bits.AbstractBitVector1
    x::T
end

which stores the bit width in the parameter N. However creating a StaticBitVector this way is usually more than 100 times slower unless a literal is substituted for N, or if Val is used (and used carefully), which is not always possible.

These problems, with BitVector1Mask and StaticBitVector, are why I am leaning toward a Vector of fixed-width unsigned integers plus a single field for the common bit width. However, if you get a value from the Vector, you have lost the information on the bitwidth, which introduces complications. I think if you write code very carefully, you can have getindex semantically construct AbstractBitVector1 on the fly, but in fact have the compiler elide the construction. But that will be fiddly and require that you always use it correctly to be performant, and the only way to be sure that you are is to benchmark.

1 Like

I think Static and Bit work against each other in some subtle ways:

  • StaticArray doesn’t save storage size, BitArray does
  • as a tradeoff, BitArray getindex time is slow because you have to mask them
  • masking increase registry pressure
  • StaticArray doesn’t have this extra overhead, if you just store StaticArray{UInt8} it’s fastest.
1 Like