Possibly confusing or inconsistent methods for read()



Ran into this gotcha today and figured I should flag it for anyone else who runs into it.

From the v0.5.1 docs, read() has the following two methods:

read(stream::IO, T, dims)

Read a series of values of type T from stream, in canonical binary representation. dims is either a tuple or a series of integer arguments specifying the size of the Array{T} to return.

read(s::IO, nb=typemax(Int))

Read at most nb bytes from s, returning a Vector{UInt8} of the bytes read.

What confused me here is that the first method must read exactly prod(dims) * sizeof(T) bytes, whereas the second method reads at most nb bytes and can return fewer. In the 1D case, read(stream, nb) looks very similar to read(stream, T, numT), and the two even do the same thing when T == UInt8 and the stream holds at least nb bytes.
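
To make the difference concrete, here is a minimal sketch against an in-memory stream that is shorter than the request. It uses read! as a stand-in for read(stream, T, dims), on the assumption that the three-argument form fills a preallocated array in the same way; the variable names are only for illustration.

```julia
# Ten bytes of data, but sixteen are requested below.
bytes = collect(0x01:0x0a)

# read(s, nb) returns whatever is available, up to nb bytes.
short = read(IOBuffer(bytes), 16)
println(length(short))   # 10

# read! has to fill the whole array, so a short stream raises EOFError.
# (Assumed here to behave like read(stream, UInt8, 16).)
hit_eof = try
    read!(IOBuffer(bytes), zeros(UInt8, 16))
    false
catch err
    isa(err, EOFError)
end
println(hit_eof)         # true
```

So the same-looking call either returns a shorter vector or throws, depending on which method it dispatches to.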

To get around this, use the second method in combination with reinterpret(), like so:

temp = read(stream, numT * sizeof(T)) # returns at most numT * sizeof(T) bytes
data = reinterpret(T, temp) # needs length(temp) to be a multiple of sizeof(T); possibly followed by reshaping.
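
One caveat with this workaround is that a short read can end in the middle of a value, in which case reinterpret() fails. A rough sketch of a wrapper that guards against this is below; read_atmost is a hypothetical name, not something in Base, and silently dropping a trailing partial element is just one possible policy (throwing an error would be another).

```julia
# Hypothetical helper: read at most n values of bits type T from stream.
function read_atmost(stream, T, n)
    temp = read(stream, n * sizeof(T))        # may return fewer bytes than requested
    nwhole = div(length(temp), sizeof(T))     # number of complete elements read
    # Keep only whole elements; leftover trailing bytes are dropped (assumed policy).
    return reinterpret(T, temp[1:nwhole * sizeof(T)])
end

io = IOBuffer(collect(0x01:0x0a))    # 10 bytes available
vals = read_atmost(io, UInt32, 4)    # asks for 16 bytes, gets 10
println(length(vals))                # 2 whole UInt32 values
```

The slice makes a copy of the bytes, which keeps the sketch simple at the cost of an extra allocation.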

I’m open to other suggestions for reading at most numT values of a bits type T from a stream.