ok, thank you… I have read that page, but I thought that limit applied only to very old ZIP implementations, and that modern implementations allow up to 16 exabytes…
What other options are there to save large DataFrames in a compressed way?
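One option I would consider (my own sketch, not something confirmed in this thread) is Arrow.jl, which can compress the columns with LZ4 or Zstd as it writes:

using Arrow, DataFrames

df = DataFrame(a = rand(4000), b = rand(4000))
Arrow.write("df.arrow", df; compress = :zstd)   # :lz4 is also supported
df_copy = DataFrame(Arrow.Table("df.arrow"))    # load back as a DataFrame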
EDIT: I managed to get a 5 GB .tgz using the command line (I am on Linux), but I would prefer an OS-independent way in my code…
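For reference, a minimal sketch of building the same tarball from within Julia, OS-independently, assuming the Tar.jl and CodecZlib packages (the file and directory names are placeholders):

using Tar, CodecZlib

# write a gzip-compressed tarball through a compressor stream
tar_gz = open("data.tgz", write = true)   # "data.tgz" is a placeholder name
tar = GzipCompressorStream(tar_gz)
Tar.create("data_dir", tar)               # "data_dir" is the directory to pack
close(tar)                                # also flushes and closes the file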
EDIT2:
It seems the standard way to read/write compressed CSV files is to use gzip:
using DataFrames, CSV, CodecZlib
a = DataFrame(a = rand(4000), b = rand(4000))
CSV.write("a.csv.gz", a; compress = true)   # compress=true writes gzip output
a_copy = CSV.read("a.csv.gz", DataFrame)    # the gzip header is detected automatically
a == a_copy # true
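In case you want the decompression to be explicit rather than rely on the automatic detection, a sketch of my own using CodecZlib's decompressor stream:

# explicit decompression via a stream instead of automatic detection
a_copy2 = open("a.csv.gz") do io
    CSV.read(GzipDecompressorStream(io), DataFrame)
end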
Strange that it isn't easier to find… I will now try it with my real dataset…