Hello,
Could you recommend a reliable, stable, and simple package for long-term data storage in a universal, cross-platform format?
Support for DataFrames is desirable.
Probably CSV is the best option for your requirements.
You can read and write CSV files efficiently with the CSV.jl package.
Reliable - CSV, Parquet, Serialization, JDF
Stable - CSV, Parquet
Simple - CSV, Parquet, JDF
Long term - CSV, Parquet
Universal cross-platform - do you mean OS or language? CSV, Parquet
JDF.jl is what I would use in the Julia ecosystem, and given the recent interest I will try to update it soon. I also see a way forward for making it multi-language.
Thank you!
So can you tell me why JDF reads so fast compared to the out-of-the-box serialization? ))
Compression.
Most of the time spent reading a file goes to reading from the hard drive, because a hard drive (even an SSD) is slow compared to RAM. If you compress the data, you read less from the drive and trade that for decompression time on the CPU. Overall, it is still faster.
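To make the trade-off above concrete, here is a rough, language-neutral sketch in Python. It only shows that a repetitive column shrinks a lot when compressed, so far fewer bytes would have to come off the disk. The helper names (`make_column`, `compress_column`, `decompress_column`) and the sample values are made up for illustration. zlib is used only because it ships with Python; this is not a claim about which codec JDF uses.

```python
import zlib


def make_column(n_rows: int) -> bytes:
    """Build a hypothetical low-cardinality text column, one value per line."""
    values = ["apple", "banana", "cherry"]
    text = "\n".join(values[i % len(values)] for i in range(n_rows))
    return text.encode("utf-8")


def compress_column(raw: bytes) -> bytes:
    """Compress the column bytes (zlib level 6, chosen only for illustration)."""
    return zlib.compress(raw, 6)


def decompress_column(blob: bytes) -> bytes:
    """Restore the original bytes; this is the CPU cost paid at read time."""
    return zlib.decompress(blob)


raw = make_column(100_000)
packed = compress_column(raw)
restored = decompress_column(packed)

# Fewer bytes on disk means less time waiting on the drive.
print("raw bytes:   ", len(raw))
print("packed bytes:", len(packed))
print("round trip ok:", restored == raw)
```

How much you gain depends on the data: repetitive columns compress very well, while random-looking data barely compresses at all, and then the decompression step is pure overhead.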
@andrey2185 https://github.com/xiaodaigh/JDF.jl/issues
If you like, you can write down the key features you want in such a format so I can note them down. I am close to getting more done on JDF.