Unexpected dimension mismatch when using read!(io, BitArray)

I am running into a strange DimensionMismatch error when using read! to fill a BitArray. The offending lines look like this:

frames = BitArray(N_bits_per_frame, N_frames)
read!(joinpath(root_dir, bin_file), frames)

The strange thing is, this is part of a loop, and depending on the values of N_bits_per_frame and N_frames the line executes just fine. However, the values below cause an error:

frames = BitArray(480, 89893)
read!(joinpath(root_dir, bin_file), frames)

Julia outputs:

ERROR: LoadError: DimensionMismatch("read mismatch, found non-zero bits after BitArray length")

This is especially perplexing because other combinations of N_bits_per_frame and N_frames don’t produce an error. For example:

frames = BitArray(480, 89896)
read!(joinpath(root_dir, bin_file), frames)

runs just fine.

Is this a bug? It seems that read!(s::IO, B::BitArray) operates by filling B’s chunks, but then shouldn’t it intelligently truncate to B’s initial dimensions? What am I missing here?

I just ran into this bug. It appears to be triggered if the length of the BitArray (i.e. for a 2D BitArray, the number of rows times the number of columns) is not divisible by 64, which is the underlying chunk size. For your examples:

julia> mod(480*89893, 64)
32

julia> mod(480*89896, 64)
0

The first length is not cleanly divisible by 64 and throws this error. It looks like the error is thrown because read! always reads the entire 64 bits of the last chunk, so it reads past the end of the “real” data and then finds non-zero bits in the part that should be discarded. The proper solution could be:

  1. Move the stream back to where the “real” data ends, although this could lead to an EOF
  2. Read in chunks until the very last chunk and then read bit by bit
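
A variation on the second idea can be sketched as follows: read whole chunks as before, then clear the padding bits in the last chunk instead of throwing. This is only a sketch. `read_masked!` is a hypothetical name, not part of Base, and it relies on the internal `chunks` field of BitArray. It also assumes the stream holds enough bytes to fill every chunk; otherwise read! raises an EOFError.

```julia
# Hypothetical helper (not part of Base): fill B from io, silently dropping
# any bits that fall past length(B) in the last 64-bit chunk.
function read_masked!(io::IO, B::BitArray)
    Bc = B.chunks                     # internal storage, a Vector{UInt64}
    read!(io, Bc)                     # consumes 8 * length(Bc) bytes
    if !isempty(Bc)
        unused = mod(-length(B), 64)  # number of padding bits in the last chunk
        Bc[end] &= typemax(UInt64) >>> unused
    end
    return B
end

# Usage: one chunk with all 64 bits set, read into a 5x6 (30-bit) array.
io = IOBuffer()
write(io, typemax(UInt64))
seekstart(io)
B = read_masked!(io, BitArray(undef, 5, 6))
```

Note that this discards the extra bits, so it is only appropriate when they are known to be padding rather than the start of the next record.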

How is the file you are reading from filled in?
From /base/bitarray.jl:

function read!(s::IO, B::BitArray)
    n = length(B)
    Bc = B.chunks
    nc = length(read!(s, Bc))
    if length(Bc) > 0 && Bc[end] & _msk_end(n) ≠ Bc[end]
        Bc[end] &= _msk_end(n) # ensure that the BitArray is not broken
        throw(DimensionMismatch("read mismatch, found non-zero bits after BitArray length"))
    end
    return B
end

Obviously, Julia expects a BitArray to be written and read as a UInt64 array, so parts of two adjacent bit arrays cannot share the same chunk.
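
To illustrate, here is a small sketch that uses an in-memory IOBuffer in place of a file (the buffers and variable names are assumptions for illustration only). Writing a 30-bit BitArray emits one full 8-byte chunk, and reading it back succeeds only because the padding bits are zero.

```julia
B = trues(30)
io = IOBuffer()
nbytes = write(io, B)                 # whole chunks are written, not just 30 bits
seekstart(io)
B2 = read!(io, BitArray(undef, 30))   # round trip works: padding bits are zero

# A chunk with non-zero bits past position 30 triggers the error.
io_bad = IOBuffer()
write(io_bad, typemax(UInt64))
seekstart(io_bad)
threw = try
    read!(io_bad, BitArray(undef, 30))
    false
catch err
    err isa DimensionMismatch
end
```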

It’s filled in by imagemagick. I’m not sure about the exact implementation.

But in my case, the length of the bit array is 319774, which is not divisible by 8 or 64, and the last chunk is filled like so:

0xffffffffffffffff

instead of the expected

Base._msk_end(n) # 0x000000003fffffff

Do you need the remainder of the last chunk (for example, as the beginning of the next chunk)?
If not, then you can do something like

ba = BitArray(undef, 64 * cld(319774, 64))
read!(io, ba)
resize!(ba, 319774)
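
For a multidimensional array like the frames in the original question, the same idea might look like the sketch below. `read_padded` is a hypothetical helper name, and it assumes the stream really contains the padded number of bytes (a multiple of 8); otherwise read! raises an EOFError.

```julia
# Hypothetical helper: read into a BitVector padded to a multiple of 64 bits,
# trim it to the wanted length, then reshape to the requested dimensions.
function read_padded(io::IO, dims::Dims)
    n = prod(dims)
    ba = BitVector(undef, 64 * cld(n, 64))  # length is a whole number of chunks
    read!(io, ba)                           # no partial last chunk, so no error
    resize!(ba, n)                          # drops the padding bits
    return reshape(ba, dims)
end

# Usage with an in-memory buffer standing in for the file.
io = IOBuffer()
write(io, typemax(UInt64))
seekstart(io)
A = read_padded(io, (5, 6))
```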