Hello,
I have a program that outputs numerical data. Currently I’m appending into a CSV file so that the progress is saved even if the program is suddenly terminated. Please find the MWE below.
How could I output to a zipped CSV file instead? I have looked at various packages but could not make it work.
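using DelimitedFiles

# Create (or truncate) the output file before the loop starts.
open("output.csv", write=true, truncate=true) do io
end

x = zeros(100)
for sdx = 1:1000
    # Compute x .... takes minutes.
    x .= randn(100)
    # Reopen in append mode so each row is saved as soon as it is computed.
    open("output.csv", write=true, append=true) do io
        writedlm(io, vec(x)', ',')
    end
end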
I think that gz supports appending:
- you just open a stream to append to a file,
- make a gzip output stream, e.g. with CodecZlib.jl,
- write to that stream.
Make sure to test this for critical data though.
Thank you for pointing me in the right direction!
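using CodecZlib
using DelimitedFiles

# Create (or truncate) the output file before the loop starts.
open("output.csv.gz", write=true, truncate=true) do io
end

x = zeros(100)
for sdx = 1:1000
    # Compute x .... takes minutes.
    x .= randn(100)
    # Each iteration appends one complete gzip member to the file.
    fout = open("output.csv.gz", write=true, append=true)
    gz = GzipCompressorStream(fout)
    writedlm(gz, vec(x)', ',')
    close(gz)  # finishes the gzip member and also closes fout
end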
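Following the advice to test this, here is a minimal round-trip check. It is only a sketch: the scratch path and the names `rows` and `data` are made up for illustration, and it assumes that `GzipDecompressorStream` keeps decoding across concatenated gzip members, which I believe is the default behaviour but is worth confirming on your own setup.

```julia
using CodecZlib
using DelimitedFiles

# Hypothetical scratch file, so the real output is not touched.
path = joinpath(mktempdir(), "roundtrip.csv.gz")

# Write three rows, each appended as its own gzip member (same pattern as the loop above).
rows = [randn(5) for _ in 1:3]
for r in rows
    gz = GzipCompressorStream(open(path, write=true, append=true))
    writedlm(gz, r', ',')
    close(gz)  # also closes the underlying file
end

# Read all members back as one decompressed CSV stream.
data = open(path) do io
    readdlm(GzipDecompressorStream(io), ',')
end

# The matrix read back should have one row per append, with the original values.
@assert size(data) == (3, 5)
@assert data ≈ permutedims(reduce(hcat, rows))
```

If both assertions pass, the appended members read back as a single CSV. Note that compressing one short row at a time gives a poorer compression ratio than compressing the whole file at once, since every member carries its own header and trailer.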
Another option is to output to a binary format like JDF, Parquet, or Feather via Parquet.jl and Feather.jl. You will get better features, like selectively loading columns and smaller compressed file sizes (for Parquet and JDF).