Iterating on bits

How do I iterate over the bits of a UInt8, UInt16 or UInt64? I tried

julia> for x in digits(Bool, UInt16(17),2) ...

It works but it creates an intermediate array, I am not sure this is how it should be done.

There is the digits! function which at least allow you to re-use the allocated array. I’ve a not-yet-published package for this kind of things, I’ll try to remember to link it here when it’s ready.

Do you actually want to iterate over the bits? Or the digits? Here’s an example of a simple BitIterator:

struct BitIterator{T}
    val::T
end
Base.start(b::BitIterator) = 1
Base.next(b::BitIterator{T}, i) where {T} = ((b.val & T(2)^(sizeof(T) * 8 - i)) == 1, i + 1)
nbits(T) = sizeof(T) * 8
Base.done(b::BitIterator{T}, i) where {T} = i > nbits(T)

Which can be used like

julia> b = BitIterator(UInt8(1))
BitIterator{UInt8}(0x01)

julia> bits(UInt8(1))
"00000001"

julia> for i in b
           println(i)
       end
Loop variable "i" overwrites a variable in an enclosing scope. In the future the variable will be local to the loop instead.
false
false
false
false
false
false
false
true

julia> b = BitIterator(UInt64(1))
BitIterator{UInt64}(0x0000000000000001)

julia> bits(UInt64(1))
"0000000000000000000000000000000000000000000000000000000000000001"

julia> for i in b
           println(i)
       end
Loop variable "i" overwrites a variable in an enclosing scope. In the future the variable will be local to the loop instead.
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
false
true
3 Likes

For those wondering how you’d do this nowadays:

struct BitIterator{T}
    val::T
end
function Base.iterate(S::BitIterator{T}, i = 1) where {T}
    nbits = sizeof(T) * 8
    if i > nbits
        return nothing
    end
    return ((S.val & T(2)^(nbits - i)) == 1, i + 1)
end

Here’s another solution:

bits(x) = (x>>i & 1 for i in 0:8*sizeof(x)-1)

julia> bits(Int8(17)) |> collect
8-element Vector{Int64}:
 1
 0
 0
 0
 1
 0
 0
 0

Edit: I originally wrote sizeof(x) instead of sizeof(x)-1, thanks @stevengj for catching it.

1 Like

Rather than computing 2^i on each iteration, it should be more efficient to do a shift of a bitmask on each iteration:

struct BitIterator{T<:Integer}
    val::T
end
Base.length(itr::BitIterator) = sizeof(itr.val) * 8
Base.eltype(itr::BitIterator) = Bool
function Base.iterate(itr::BitIterator{T}, mask=one(T)) where {T}
    iszero(mask) && return nothing
    return !iszero(itr.val & mask), mask << 1
end

which gives

julia> collect(BitIterator(Int8(17)))'
1Ă—8 adjoint(::Vector{Bool}) with eltype Bool:
 1  0  0  0  1  0  0  0

@sijo gave a related solution using a generator, but

has the wrong length — it should be 0:8*sizeof(x)-1

3 Likes

(Note, however, that none of these solutions work for BigInt, for which sizeof is not the number of bytes in the value and for which repeatedly shifting left never produces zero. You would want specialized BitIterator{BigInt} methods to handle that case.)

That’s a really nice solution, @stevengj! On the back of this, what would you do if you wanted to iterate over bits of bytes? Would you just create a generator using BitIterator and flatten it?

bytes = [0x02, 0x01, 0x03]
Iterators.flatten(i for i in BitIterator.(bytes)) |> collect

Or would you create another struct for such a case?

I would re-think my life choices, since this is a weird programming pattern? Or use a BitVector?

That works too, but you have to be consistent about the order in which you want to iterate. BitIterator goes from least-significant (“rightmost”) to most-significant, so the analogous thing would probably be:

Iterators.flatten(BitIterator(b) for b in Iterators.reverse(bytes)) |> collect

Note that I would avoid BitIterator.(bytes), which allocates an unnecessary intermediate array of BitIterator.

3 Likes

The reason I ask is, for a bit of fun, I’m implementing some LSB-steganography in Julia, based on this, but I am trying to use fewer strings. As such, I am trying to use bytes and bits and such. I wanted to put the bits in a BitVector so that I can zip it with the CartesianCoordinates of the image matrix. Previously I was using

vcat((digits(b, base = 2, pad = 8) for b in bytes)...)

But that seems inefficient if an iterator exists…

Aside: One of the nice things about implementing an iterator type, as opposed to using a generator, is that you can easily define additional methods on it, e.g. you can make it indexable to fetch bits with itr[bit]:

Base.firstindex(itr::BitIterator) = 0
Base.lastindex(itr::BitIterator) = length(itr)-1
Base.getindex(itr::BitIterator{T}, i::Integer) where {T} = !iszero(itr.val & (one(T) << i))

and in principle you could define other specialized methods, e.g. define count(iszero, itr) using count_zeros(itr.val).

(But at some point you are re-implementing a static-length BitVector here.)

2 Likes

You realize that BitVector is an actual type in Julia, right? You can use this if you want a “bit-packed” array of boolean (bit) values, of arbitrary length, accessed like an array.

2 Likes

Yes, I know it’s a type in Julia, however I don’t have an array of booleans, I have an array of bytes (i.e., Vector{UInt8}), which is why your BitIterator type is super useful in this case :slight_smile:

You have an array of bytes that you want to interpret as an array of bits. Why not store these bits in a BitVector instead (which under the hood uses an array of integer “words” to store the bits)?

(Though it is typically faster to work with an array of Bool values — one byte per bit. This costs you some storage but is typically more efficient to work with.)

2 Likes

That’s what I’d like to do…only I’m not sure how to convert Vector{UInt8} or Base.CodeUnits into a BitVector… That said, would it not be better if I only had an iterator? Then I don’t have to store a (potentially large) BitVector in memory?

What is the source of your data? Why not read it into a BitVector (or BitMatrix for an image) to start with?

1 Like

I’m not sure what my source data is going to be. I’m implementing the function for Vector{UInt8}, as that seems standard, but to be honest I’m not sure what would be most useful in “real life”. I mentioned CodeUnits because I figured steganography would probably mostly be used to hide written information, so if you had a message (i.e., a string), you could convert it to bytes using codeunits.

For the fun of it:

julia> function bits(x)
         g = (x>>i & 1 for i in 0:8*sizeof(x)-1)
         eval(:(Base.getindex(g::typeof($g), i) = g.f(i-1)))
         return g
       end;

julia> g = bits(Int8(7));

julia> join(g)
"11100000"

julia> g[3], g[4]
(1, 0)

(Note that this particular method might be implemented by default if this PR gets merged.)

4 Likes