Primitive type with 2 bits

I have a very large matrix (with millions of rows), which obviously uses a lot of memory. Each of its entries is either 0, 1 or 2. At the moment, the matrix has type Array{Int8,2}, but 2 bits per entry would suffice and would reduce memory usage.

I tried to define a new type:

primitive type UInt2 <: Unsigned 2 end

but it throws an error. Is 8 bits the minimum one can use?

Indeed. A DiBitVector from DataStructures.jl might be useful to you.


There’s also BitArrays: https://docs.julialang.org/en/v1/base/arrays/#Base.BitArray

Edit: oops, I misread the OP and thought they only needed two values instead of three.


Yes, 8 bits is the minimum one can use for a primitive type.

However, you can still implement bit-packed data structures like BitArray or DiBitVector, as linked above. These work by storing arrays of e.g. UInt64 “chunks” (64-bit quantities) under the hood and providing accessor functions (e.g. getindex) that extract individual 1-bit or 2-bit elements, respectively, from the bits of these “chunks”.
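To make the chunk idea concrete, here is a minimal sketch of a 2-bit packed vector. The type name `PackedDibits` and the helper `locate` are made up for illustration; this is not how DataStructures.jl implements DiBitVector, just one way the shift-and-mask approach could look, assuming 32 two-bit entries per UInt64 chunk.

```julia
# Hypothetical type for illustration; NOT the DataStructures.jl implementation.
struct PackedDibits <: AbstractVector{UInt8}
    chunks::Vector{UInt64}  # each 64-bit chunk holds 32 two-bit entries
    len::Int                # number of logical entries
end

# Allocate n entries, all initialised to 0.
PackedDibits(n::Integer) = PackedDibits(zeros(UInt64, cld(n, 32)), n)

Base.size(v::PackedDibits) = (v.len,)
Base.IndexStyle(::Type{PackedDibits}) = IndexLinear()

# Map entry i to (chunk index, bit offset inside that chunk).
@inline function locate(i::Int)
    chunk, slot = divrem(i - 1, 32)
    return chunk + 1, 2 * slot
end

function Base.getindex(v::PackedDibits, i::Int)
    @boundscheck checkbounds(v, i)
    c, shift = locate(i)
    # Shift the wanted two bits down to the bottom, then mask the rest away.
    return UInt8((v.chunks[c] >> shift) & 0x03)
end

function Base.setindex!(v::PackedDibits, x, i::Int)
    @boundscheck checkbounds(v, i)
    0 <= x <= 3 || throw(ArgumentError("value must be in 0:3"))
    c, shift = locate(i)
    mask = UInt64(0x03) << shift
    # Clear the old two bits, then write the new ones in their place.
    v.chunks[c] = (v.chunks[c] & ~mask) | (UInt64(x) << shift)
    return v
end

v = PackedDibits(100)   # 100 entries stored in 4 UInt64 chunks
v[1] = 2
v[33] = 1               # first entry of the second chunk
v[100] = 2
```

Because the type subtypes AbstractVector, generic functions such as `length`, `sum` and iteration work on it without further code, at the cost of a shift and a mask on every access.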


This ended up being the best solution for me. I did not want to define the matrix as a vector of DiBitVectors, so I defined a BitArray matrix with twice the number of columns, storing the 2 bits of information I need in each 1×2 block of the matrix.
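For anyone wanting to try the doubled-column layout, a rough sketch follows. The helper names `getentry` and `setentry!` and the choice of putting the low bit in column 2j-1 and the high bit in column 2j are assumptions for illustration, not necessarily the exact layout described above.

```julia
# Hypothetical helpers; the bit layout below is an assumption.
# Logical entry (i, j) lives in columns 2j-1 (low bit) and 2j (high bit) of B.
function getentry(B::BitMatrix, i::Int, j::Int)
    return Int(B[i, 2j - 1]) + 2 * Int(B[i, 2j])
end

function setentry!(B::BitMatrix, x::Integer, i::Int, j::Int)
    0 <= x <= 3 || throw(ArgumentError("value must be in 0:3"))
    B[i, 2j - 1] = isodd(x)  # low bit
    B[i, 2j] = x >= 2        # high bit
    return B
end

B = falses(4, 2 * 3)   # a 4x3 logical matrix stored as a 4x6 BitMatrix
setentry!(B, 2, 1, 3)
setentry!(B, 1, 4, 1)
```

Since a BitMatrix stores one bit per element, this uses 2 bits per logical entry, roughly a quarter of the memory of the original Array{Int8,2}.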
