Fast copy from stream to IOBuffer?

I need to read parts of a (binary) stream, and concatenate them, so I can read from that concatenated buffer. I’m reconstructing a thing that was packetized into segments.

I see I can do write(iob, read(s, ...)), but I am afraid that will do two copies and an intermediate allocation: allocating the result of the read, copying the data once into that result, then copying it again into the IOBuffer. Is there a way to copy directly from the stream to the IOBuffer?

Even better, could the compiler notice and optimize this case so I can write the code above, but the compiler arranges to not materialize the result of the read?

I guess there are better answers, but if you know the size of the binary chunk, you can use the readbytes! function. This doesn't avoid allocation completely, but that may not matter because the buffer is allocated only once.

Here is an example (untested)

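# Scratch buffer, allocated once; chunk_size, s and iob are assumed to be defined
buf = Vector{UInt8}(undef, chunk_size)
while !eof(s)
  # readbytes! reads up to length(buf) bytes into buf and returns the count read
  nb = readbytes!(s, buf)
  # write only the filled prefix; the view avoids copying the slice
  write(iob, @view buf[1:nb])
end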

Good idea! That still copies the data twice, but it’s in the right direction…

write(iob, s)?

[Back to my original response. Not coherent today…]

I don’t think that helps for my case. I don’t want the whole stream, just the next bytes – the body of the fragmented packet.

I can think of a way to manage the buffer I write to manually (if I know how big it needs to be). I just need specialised write and read methods for that manual buffer: allocate an array, keep track of the current position, and use readbytes! to read into that array at the right location.
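Something like this rough sketch is what I have in mind. The helper name read_at! is made up for illustration, and I use read! on a view rather than readbytes!, since read! accepts any AbstractArray and throws an EOFError if the stream runs short. I believe recent Julia versions read straight into a contiguous view of a Vector without a temporary, but I haven't checked every version. The toy stream (one header byte followed by a body, twice) is an assumption, not the real packet format.

```julia
# Hypothetical helper: read exactly n bytes from s into out, starting just
# after index pos, and return the new write position.
function read_at!(out::Vector{UInt8}, pos::Integer, s::IO, n::Integer)
    # view() makes no copy; read! throws EOFError if fewer than n bytes remain
    read!(s, view(out, pos+1:pos+n))
    return pos + n
end

# Toy input (assumed format): header byte, 3-byte body, header byte, 2-byte body
s = IOBuffer(UInt8[0xAA, 1, 2, 3, 0xBB, 4, 5])

out = Vector{UInt8}(undef, 5)   # total body size must be known up front
pos = 0
skip(s, 1)                      # skip the first header
pos = read_at!(out, pos, s, 3)
skip(s, 1)                      # skip the second header
pos = read_at!(out, pos, s, 2)

body = IOBuffer(out)            # wraps out for reading, without copying it
```

The obvious downside is that the total size has to be known before the first read; otherwise the array needs a resize! along the way.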

Input streams instead support the notion of “anchoring”, which instructs the stream to save the current position in the buffer. If the buffer gets refilled, then any data in the buffer including or following that position gets shifted over to make room. When the match is finished, one can then call takeanchored! to return an array of the bytes from the anchored position to the current position, or upanchor! to return the index of the anchored position in the buffer.

https://biojulia.net/BufferedStreams.jl/stable/inputstreams.html

Okay. So I want write(s::IO, from::IO, nb::Integer). I’ll implement it, when I get a chance, and submit a PR.
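Until something like that exists in Base, a stopgap under a different name could look like the sketch below. copy_nbytes! and its scratch keyword are hypothetical names of my own, not an existing API, and I deliberately don't add a method to Base.write. It still copies through a scratch buffer, so the data moves twice, but the scratch buffer can be reused across calls and its size doesn't depend on nb.

```julia
# Hypothetical helper (not part of Base): copy up to nb bytes from src to dst
# through a reusable scratch buffer. Returns the number of bytes copied, which
# is less than nb if src hits end-of-stream first.
function copy_nbytes!(dst::IO, src::IO, nb::Integer;
                      scratch::Vector{UInt8} = Vector{UInt8}(undef, 64 * 1024))
    nb >= 0 || throw(ArgumentError("nb must be non-negative"))
    remaining = Int(nb)
    while remaining > 0
        # never ask for more than is still wanted or than scratch can hold
        n = readbytes!(src, scratch, min(remaining, length(scratch)))
        n == 0 && break                   # end of stream (or empty scratch)
        write(dst, view(scratch, 1:n))    # write only the bytes just read
        remaining -= n
    end
    return Int(nb) - remaining
end

# Usage: copy the next 4 bytes of src into dst
src = IOBuffer(collect(UInt8, 1:10))
dst = IOBuffer()
ncopied = copy_nbytes!(dst, src, 4)
```

Passing the same scratch vector on every call avoids the default 64 KiB allocation each time.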

Looks like I can just take the code from readbytes_some! (called from read(s::IO, nb::Integer = typemax(Int))), and combine that with the code from write(s::IO, a::Array), skipping the intermediate array. It would be okay with me if someone got to it before me. :wink: