I’m trying to find a way to implement a static bit array that is performant and memory efficient. I wonder if someone has solved some of the issues below. For example, I want to be able to create and access them quickly and store them efficiently. Quick compared to other statically allocated things. By “static”, I mean not dynamically allocated. And I actually want to use collections (arrays and dictionaries) of these objects.
bitarray.jl
contains the implementation of Base.BitArray
, a dynamically allocated array, in 2000 lines. At present I don’t need much of the array interface. I do want to support shifting, bitwise operations, displaying as bits, and a few other things.
For some purposes it seems like the following is best.
- Wrap fixed-sized integers in BitIntegers.jl. The latter allows me to define
UIntX
whereX
can be really big. - The width of an unsigned ints in
BitIntegers.jl
has to be a multiple of 8. So, I may need to store the number of those bits that I am actually using if I want to represent arrays without this restriction. So I create a struct containing two fields, aVector
of my wrappedUIntX
andn
the (common) width of each bit array in the vector. I do the same for aDictionary
fromDictionaries.jl
.
I have some of this implemented in BitsX.jl In particular, this code quickly creates the appropriate UInt
type if it does not exist.
I also use Bits.jl which defines
struct BitVector1Mask{T<:Real} <: AbstractBitVector1
x::T
len::Int
end
But this is wasteful if I want an Array
of BitVector1Mask
where len
is the same for each element.
Instead, I tried
struct StaticBitVector{T<:Real, N} <: Bits.AbstractBitVector1
x::T
end
which stores the bit width in the parameter N
. However creating a StaticBitVector
this way is usually more than 100
times slower unless a literal is substituted for N
, or if Val
is used (and used carefully), which is not always possible.
These problems, with BitVector1Mask
and StaticBitVector
, are why I am leaning toward a Vector
of fixed-width unsigned integers plus a single field for the common bit width. However, if you get a value from the Vector
, you have lost the information on the bitwidth, which introduces complications. I think if you write code very carefully, you can have getindex
semantically construct AbstractBitVector1
on the fly, but in fact have the compiler elide the construction. But that will be fiddly and require that you always use it correctly to be performant, and the only way to be sure that you are is to benchmark.