I have: Vector{UInt8}. I need: BitVector

I have

julia> z
3-element Array{UInt8,1}:
 0x66
 0x6f
 0x6f

What I’d like is the (24-element) BitVector representation of this sequence. However, my attempt to use reinterpret yielded a very confusing error:

julia> reinterpret(BitArray{1}, z)
ERROR: ArgumentError: cannot reinterpret Array{UInt8} to ::Type{Array{BitArray{1}}}, type BitArray{1} is not a bitstype
 in reinterpret(::Type{BitArray{1}}, ::Array{UInt8,1}, ::Tuple{Int64}) at ./array.jl:87
 in reinterpret(::Type{BitArray{1}}, ::Array{UInt8,1}) at ./array.jl:75

Is there an efficient way to get a BitVector from z?

(Edited to add: discount any endianness issues for the moment; I can probably work around those.)

As mentioned on gitter, here’s an iterative approach:

zs = [0x1, 0x2, 0x3]
b = BitVector()
for z in zs
    append!(b, [z & (0x1<<n) != 0 for n in 0:7])
end

Alternatively, you could create a bitvector of the target size up front (BitVector(24) for 24 bits) and then assign 8 values at a time via b[1:8] = ...; b[9:16] = ..., etc.
I didn’t find a fast and easy way to build a bitvector from all the bits of a UInt8 in one shot, though.

Edit
Actually, the combination of the above also appears to work with broadcasting:

z = 0x12
b[1:8] .= [z & (0x1<<n) != 0 for n in 0:7]

Looks a bit fancier.
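
The preallocation idea above can be sketched roughly as follows. This is only an illustration: bytes_to_bits is a hypothetical helper name, and the bit order (least significant bit first within each byte) is an assumption chosen to match the loop above.

```julia
# Expand each byte into 8 Bools, least-significant bit first.
# `bytes_to_bits` is a hypothetical name used only for this sketch.
function bytes_to_bits(bytes::Vector{UInt8})
    bits = falses(8 * length(bytes))    # preallocated BitVector, all false
    for (i, byte) in enumerate(bytes)
        for n in 0:7
            # bit n of byte i lands at index 8*(i-1) + n + 1
            bits[8 * (i - 1) + n + 1] = (byte >> n) & 0x01 == 0x01
        end
    end
    return bits
end

b = bytes_to_bits(UInt8[0x66, 0x6f, 0x6f])
length(b)   # 24
```

Filling a preallocated vector avoids building a temporary 8-element array per byte, which the append! version does on every iteration.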

reinterpret takes the element type as its first argument, but there is no element type Bit.

BitArray is a composite type, not a bitstype, so reinterpret won’t work directly.

  type BitArray{N} <: DenseArray{Bool,N}

  Fields:

  chunks :: Array{UInt64,1}
  len    :: Int64
  dims   :: Tuple{Vararg{Int64,N}}
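
To make the chunks field a little more concrete, here is a small sketch of how bit indices map onto the UInt64 chunks. It reads the internal chunks field directly, which is an implementation detail rather than public API, so treat the layout as an observation and not a guarantee.

```julia
bv = falses(70)     # 70 bits need two UInt64 chunks
bv[1]  = true       # lowest bit of the first chunk
bv[3]  = true       # third-lowest bit of the first chunk
bv[65] = true       # lowest bit of the second chunk

bv.chunks           # 0x0000000000000005, 0x0000000000000001
```

Bits are packed starting from the least significant bit of each chunk, which is why raw bytes have to end up in the right position within a UInt64 for the approaches in this thread to work.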

You could do something like the following (it does seem a bit hacky, though, so I’m not sure it’s recommended):

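a = UInt8[0x66, 0x6f, 0x6f]
b = append!(copy(a), fill(UInt8(0), 5)) # pad 3 bytes to 8, the size of one UInt64 chunk
c = reinterpret(UInt64, b)              # reinterpret the 8 bytes as one UInt64
B = BitVector(24)                       # 24-bit vector backed by a single chunk
B.chunks = c                            # swap in the chunk (relies on BitArray internals)

function make_bitvector(v::Vector{UInt8})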
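    siz = sizeof(v)                     # number of bytes in v
    bv = falses(siz<<3)                 # 8 bits per byte, all initially false
    # copy the raw bytes straight into the chunk storage
    unsafe_copy!(reinterpret(Ptr{UInt8}, pointer(bv.chunks)), pointer(v), siz)
    bv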
end

function make_bitvector(v::Vector{UInt8}, dim::Integer)
    siz = sizeof(v)
    # the chunk storage holds ceil(dim/64) * 8 bytes; refuse if v does not fit
    (((dim + 63) >>> 6) << 3) < siz && error("$dim too small for size $siz vector")
    bv = falses(dim)
    unsafe_copy!(reinterpret(Ptr{UInt8}, pointer(bv.chunks)), pointer(v), siz)
    bv
end

These functions should do the job. The second form lets you set the number of bits in the resulting BitVector, instead of always using 8 times the number of input bytes. (It may need a bit more work to mask off any bits beyond dim, but it’s a start.)
These should be faster than the other approaches discussed here.
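
For comparison, here is a pointer-free sketch of the same chunk-filling idea. pack_bytes is a hypothetical name, the code writes to the internal chunks field, and it assumes bits are stored least significant bit first within each UInt64 chunk. Because it places each byte with a shift rather than a raw memory copy, it does not depend on the machine's byte order; no claim is made here about how its speed compares with the unsafe copy.

```julia
# Pack bytes into the chunk storage of a fresh BitVector using shifts.
# `pack_bytes` is a hypothetical name; `chunks` is an internal field.
function pack_bytes(v::Vector{UInt8})
    bv = falses(8 * length(v))          # all chunks start out zeroed
    for (i, byte) in enumerate(v)
        k = i - 1                       # 0-based byte index
        # byte k occupies bits 8*(k % 8) .. 8*(k % 8) + 7 of chunk k ÷ 8
        bv.chunks[(k >> 3) + 1] |= UInt64(byte) << (8 * (k & 7))
    end
    return bv
end

bits = pack_bytes(UInt8[0x66, 0x6f, 0x6f])
length(bits)    # 24
```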

Why? The resulting BitVector won’t return elements with index greater than dim, will it?

I wasn’t sure whether some BitVector operations assume that the chunks contain 1 bits only at valid positions (this makes things like counting the 1 bits slightly easier).
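
If that assumption does hold, the masking step could look something like this sketch. clear_unused_bits! is a hypothetical helper, and it touches the internal chunks field, so it is meant as an illustration rather than a recommended API.

```julia
# Zero every bit of the final chunk that lies beyond length(bv).
# `clear_unused_bits!` is a hypothetical name used only for this sketch.
function clear_unused_bits!(bv::BitVector)
    used = length(bv) & 63              # bits used in the final chunk (0 means all 64)
    if used != 0 && !isempty(bv.chunks)
        bv.chunks[end] &= (UInt64(1) << used) - UInt64(1)
    end
    return bv
end

bv = falses(20)
bv.chunks[1] = typemax(UInt64)          # simulate stray bits copied past the end
clear_unused_bits!(bv)
bv.chunks[1]    # 0x00000000000fffff
```

Calling this after the raw copy in the two-argument make_bitvector would keep any bits beyond dim from leaking into operations that work on whole chunks.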