I am creating a lot of FlatBuffers, which I would like to store in S3 storage. Right now I use something akin to the following code to serialize the structure and put it into an S3 bucket.
fbStruct = ...
fbBytes = FlatBuffers.bytes(FlatBuffers.build!(fbStruct))
AWSS3.s3_put(awstoken, bucket, path, fbBytes)
Is there a way to compress the fbBytes vector using CodecZlib? I have found a way to do that with a temporary file, but I don't know how to do it all in memory.
Have you tried an IOBuffer?
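If not, here is a minimal sketch of what I mean (assuming fbBytes from your snippet above): write into a GzipCompressorStream that wraps an in-memory IOBuffer, finish the gzip stream explicitly, and then take! the compressed bytes.

using CodecZlib, TranscodingStreams

buffer = IOBuffer()
stream = GzipCompressorStream(buffer)
write(stream, fbBytes)                       # compress into the in-memory buffer
write(stream, TranscodingStreams.TOKEN_END)  # finish the gzip stream
flush(stream)
compressedBytes = take!(buffer)              # the gzip-compressed bytes
close(stream)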
I have tried the IOBuffer like this
stream = GzipCompressorStream(IOBuffer(fbBytes))
newBytes = read(stream)
close(stream)
however the results are different from what I get when using a temporary file and then reading it back:
open(GzipCompressorStream, filename, "w") do stream
    write(stream, fbBytes)
end
fileBytes = open("a.fb.gz", "r") do f
    read(f)
end
I am still getting my head around how TranscodingStreams, CodecZlib and the base IO work together, so maybe I am using it all wrong.
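One way I can sanity-check both results is to decompress them and compare with the original bytes (a quick sketch, assuming newBytes and fileBytes from the snippets above):

using CodecZlib

transcode(GzipDecompressor, newBytes) == fbBytes   # does the in-memory result round-trip?
transcode(GzipDecompressor, fileBytes) == fbBytes  # does the file-based result round-trip?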
It seems there is a direct Array API:
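For example (a sketch using the transcode function mentioned below; fbBytes is the vector from the question):

using CodecZlib

compressedBytes = transcode(GzipCompressor, fbBytes)          # compress a byte vector directly
originalBytes = transcode(GzipDecompressor, compressedBytes)  # and decompress it again
originalBytes == fbBytes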
Well, that was hiding in plain sight for me. Thanks for pointing that out. I somehow thought that the GzipCompressor was deprecated, however that was the case with the old GzipCompression types. Will try that straightaway.
transcode(GzipCompressor, data) is the simplest way to compress in-memory data. However, it allocates a working space every time you compress a chunk of data. If you need to compress a lot of data chunks, you can avoid these allocations by pre-allocating and reusing a compressor object, as described in this example: https://bicycle1885.github.io/TranscodingStreams.jl/stable/examples.html#Transcode-lots-of-strings-1.
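For illustration, here is a sketch of that reuse pattern from the linked example, adapted to gzip; chunks is a hypothetical iterable of byte vectors standing in for your FlatBuffers payloads.

using CodecZlib, TranscodingStreams

codec = GzipCompressor()
TranscodingStreams.initialize(codec)   # allocate the working space once
try
    for chunk in chunks                # chunks: hypothetical iterable of byte vectors
        compressedBytes = transcode(codec, chunk)
        # ... store or upload compressedBytes, e.g. to S3 ...
    end
finally
    TranscodingStreams.finalize(codec) # release the working space
end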
is the simplest way to compress in-memory data. However, it allocates a working space every time you try to compress a chunk of data. If you need to compress a lot of data chunks, you can avoid lots of allocations by reusing pre-allocating a compressor object as described in this example: https://bicycle1885.github.io/TranscodingStreams.jl/stable/examples.html#Transcode-lots-of-strings-1.