Are there potential benefits of writing to an IOBuffer() rather than printing to a file IO directly? For example, I see the following implementation,
function append(io::IO, data::SomeCompositeType)
    # append a mutable struct to an IO
    return io  # return the IO so the result can be passed to take!
end
data_str = String(take!(append(IOBuffer(), element)))
open("data_file.txt", "w") do io
    write(io, data_str)
end
Is the IOBuffer step necessary in the above example? What does it buy us? Thanks.
Mainly, IOBuffer is used in circumstances where you don’t just want to output to a file, e.g. if you want to output to a string, or you want to preprocess the data before it is written.
I don’t see much point in using an intermediate IOBuffer if you’re just going to dump it straight into a file. (Presumably file I/O is already buffered internally.)
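To make the distinction concrete, here is a minimal sketch. The `Point` type and the body of `append` are hypothetical stand-ins for the composite type and function in the question. Because `append` accepts any `IO`, the same function can fill an `IOBuffer` when you want a `String`, or write straight to a file stream when you don't.

```julia
# Hypothetical stand-in for the composite type in the question.
struct Point
    x::Int
    y::Int
end

# Write a textual form of `p` to any IO and return the IO so calls can be chained.
function append(io::IO, p::Point)
    print(io, p.x, ',', p.y, '\n')
    return io
end

p = Point(1, 2)

# Case 1: we want the text as a String, e.g. to inspect or preprocess it.
s = String(take!(append(IOBuffer(), p)))

# Case 2: we only want the text in a file, so write directly to the file stream.
path, fio = mktemp()   # temporary file, used here only for illustration
append(fio, p)
close(fio)
```

In the second case no intermediate buffer is involved, and the file ends up with the same contents as `s`.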
using BufferedStreams
using BenchmarkTools

x = "id" .* string.(rand(UInt16, 100_000_000))

# Write through a BufferedOutputStream wrapped around the file
function fn(x)
    io = BufferedOutputStream(open("c:/data/bin.bin", "w"))
    write.(Ref(io), x)
    close(io)
end

# Write directly to the file stream
function gn(x)
    io = open("c:/data/bin2.bin", "w")
    write.(Ref(io), x)
    close(io)
end

@btime fn($x)
@btime gn($x)
See above. Hmmm, I heard from my Rust programmer friend that it’s still better to manage your own buffer with IO. He wrote this program for me that was 10x faster than anything on the market… so I think he knows his stuff.
Writing to files in Rust is AFAIU completely unbuffered so using a BufWriter is crucial, while Julia should use libuv for buffering. If Julia didn’t use any buffering there would be a much larger difference than a factor of 2 here. It is interesting to see that there is a difference at all though, perhaps the BufferedOutputStream buffering is more efficient than the libuv buffering in this case.
Ah, this is on 1.3… Small writes on 1.3 are slower because they are now thread safe (and thus need to lock for every write). BufferedStreams is not thread safe, so it avoids the overhead of locking.
In some circumstances, using an IOBuffer is a faster and more efficient way to build a string from an object. For example, if you instead constructed the string by concatenating String objects, you would create many intermediate String instances in the process. With IOBuffer and the hypothetical append, those intermediate constructions are skipped, because you stream characters into a buffer and construct a single String at the end, instead of calling multiple constructors to build up whatever String you are trying to get from the element object.
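A small sketch of the two approaches, using made-up data and hypothetical function names (`join_by_concat`, `join_by_buffer`). Both produce the same result. The difference is in how many intermediate strings get created along the way.

```julia
words = ["alpha", "beta", "gamma"]

# Repeated concatenation: each `*` allocates a new String.
function join_by_concat(words)
    s = ""
    for w in words
        s = s * w * ";"
    end
    return s
end

# Streaming into one buffer: a single String is created at the end.
function join_by_buffer(words)
    io = IOBuffer()
    for w in words
        print(io, w, ';')
    end
    return String(take!(io))
end

a = join_by_concat(words)
b = join_by_buffer(words)
```

How much this matters depends on the number and size of the pieces, so it is worth measuring with `@btime` on your own data.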
This is still valid 2 years later. Inserting it here just for reference. My system is a MacBook Pro 2018 (2.5 GHz Intel i7, 16 GB 1600 DDR3 RAM) running Big Sur 11.6.3.
I am very new to Julia, so please correct me if I am wrong or use this not as intended.
I use IOBuffer when I need to concatenate larger binary strings for parsing. I come from Python, and I use it in Julia similarly to BytesIO from Python. I am working with readers for proprietary data files where the data can be chunked into pieces and the pieces spread over the whole file. Parsing such a file with a direct IOStream (obtained with open) is painful, particularly since it can be chunked in a few more layers (chunked compression and encryption). Using IOBuffer reduces the complexity by concatenating the scattered chunks into a continuous data representation layer before moving on to the next layer of parsing.
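Roughly, the pattern looks like this. It is only a sketch: the byte layout and the `chunks` table of (offset, length) pairs are invented for illustration, and a real format would read them from the file's own index.

```julia
# Hypothetical file contents: payload bytes scattered among filler bytes (0xff).
raw = UInt8[0xff, 0x01, 0x02, 0xff, 0xff, 0x03, 0x04, 0xff]

# Hypothetical chunk table: (1-based offset, length in bytes).
chunks = [(2, 2), (6, 2)]

src = IOBuffer(raw)      # stands in for the IOStream returned by open
payload = IOBuffer()     # continuous layer assembled from the chunks

for (offset, len) in chunks
    seek(src, offset - 1)            # seek uses 0-based positions
    write(payload, read(src, len))   # copy one chunk into the buffer
end

# Rewind and parse the assembled data as if it had been contiguous all along.
seekstart(payload)
first_value = ltoh(read(payload, UInt16))    # interpret bytes as little-endian
second_value = ltoh(read(payload, UInt16))
```

The next parsing layer (decompression, decryption, and so on) can then read from `payload` without knowing how the data was laid out in the file.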