I needed a function that I’m calling “seek_binary_sequence”. I’m working with binary data files and need to find a “magic sequence” of bytes. I think a reasonable implementation is as follows:
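function seek_binary_sequence(io::IO, seq::AbstractVector{<:Number})
    ix = 1  # index of the next element of seq to match
    while !eof(io) && ix <= length(seq)
        anum = read(io, eltype(seq))
        recent = [seq[1:ix-1]; anum]  # the last ix values read from the stream
        # Keep the longest prefix of seq that recent ends with, so that
        # overlapping candidates (e.g. [1, 1, 2] in 1, 1, 1, 2) are not missed.
        k = ix
        while k > 0 && recent[end-k+1:end] != seq[1:k]
            k -= 1
        end
        ix = k + 1
    end
    return ix > length(seq)  # true if the stream is positioned just past seq
end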
My hunch is that others will want to do this as well, which prompts three questions:
- Have I missed an existing implementation of this in the core parts of Julia?
- Is this something people should implement on their own whenever they need it?
- Is this a reasonable candidate for adding to Base?
Thanks in advance!
This is not exactly what you want, but you can mimic the behavior with readuntil as follows:
julia> buf = IOBuffer("foobarbaz")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=9, maxsize=Inf, ptr=1, mark=-1)
julia> readuntil(buf, b"ob") # find magic bytes "ob"
2-element Array{UInt8,1}:
0x66
0x6f
julia> read(buf, String)
"arbaz"
I don’t know of any other simple way. So perhaps adding such a function, say seekuntil, to Base would be useful.
Also, you must be careful when reading multi-byte values from an I/O stream, because the data in the stream may use a different byte order than you expect.
[EDIT] The code above is for Julia 0.7-dev. On Julia 0.6, you may write:
julia> buf = IOBuffer("foobarbaz")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=9, maxsize=Inf, ptr=1, mark=-1)
julia> readuntil(buf, "ob")
"foob"
julia> String(read(buf))
"arbaz"
@bicycle1885 Thank you for letting me know about readuntil. Also, that’s a good point about byte order.
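To make the byte-order point concrete, here is a small sketch. It assumes the data stores a 32-bit magic number in big-endian order; the value 0xCAFEBABE is only a placeholder for illustration. read returns the value in the host's native byte order, so it is converted explicitly with ntoh before comparing:

```julia
# Four bytes as they might appear in a file, most significant byte first
# (big-endian). The magic value is a made-up example.
buf = IOBuffer(UInt8[0xCA, 0xFE, 0xBA, 0xBE])

raw = read(buf, UInt32)   # bytes interpreted in the host's native byte order
magic = ntoh(raw)         # convert from big-endian (network order) to host order

# `magic` is the same on every host; `raw` alone only matches on big-endian machines.
println(magic == 0xCAFEBABE)
```

For little-endian data the corresponding conversion would be ltoh.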
That seems like very similar functionality, except that it returns the contents up to (and including) the “magic sequence”. Wanting seekuntil instead of readuntil may be somewhat rare, but the main advantage of seekuntil should be lower memory usage.
That said, the issue of byte order seems to be the best reason to restrict the delim sequence to something like Vector{UInt8}.
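For what it's worth, here is a rough sketch of what such a seekuntil could look like when delim is restricted to bytes. To be clear, seekuntil is a hypothetical name and not something that exists in Base. The sketch reads one byte at a time and keeps only a small failure table (the Knuth-Morris-Pratt idea) instead of the skipped contents, so memory use depends on the length of delim rather than on how far away the match is:

```julia
# Hypothetical helper, not part of Base: advance `io` to just past the first
# occurrence of `delim`. Returns true if found, false if EOF was reached first.
function seekuntil(io::IO, delim::AbstractVector{UInt8})
    n = length(delim)
    n == 0 && return true

    # fail[i] = length of the longest proper prefix of delim[1:i]
    # that is also a suffix of delim[1:i] (KMP failure table).
    fail = zeros(Int, n)
    k = 0
    for i in 2:n
        while k > 0 && delim[i] != delim[k + 1]
            k = fail[k]
        end
        if delim[i] == delim[k + 1]
            k += 1
        end
        fail[i] = k
    end

    matched = 0  # number of leading bytes of delim matched so far
    while !eof(io)
        b = read(io, UInt8)
        # On a mismatch, fall back to the longest shorter candidate match.
        while matched > 0 && b != delim[matched + 1]
            matched = fail[matched]
        end
        if b == delim[matched + 1]
            matched += 1
        end
        matched == n && return true
    end
    return false
end

buf = IOBuffer("foobarbaz")
found = seekuntil(buf, Vector{UInt8}("ob"))  # skip past the magic bytes "ob"
rest = String(read(buf))                     # whatever follows the match
```

With the buffer from the earlier example, found is true and rest is "arbaz". Since only single bytes are read, byte order never comes into play.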