How to save an array to disk in compressed form?

Hello, how can I save an array to disk in compressed form?

How about HDF5 or FITS formats? You may also consider JLD.


I just asked almost the same question. Is "Append to zipped CSV file" of any help? The compression does not gain much compared to a binary format, though. A rough binary format is provided by the Serialization standard library: https://docs.julialang.org/en/v1/stdlib/Serialization/. Also search for threads on this site, like "Binary output" and "How store this variable into files, and reaload it?", or just search for HDF5 on this site.
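For reference, here is a minimal sketch of the Serialization route mentioned above. It only shows the uncompressed binary round trip; the array `A` and the temporary file name are made up for illustration.

```julia
using Serialization

A = rand(3, 4)                                   # example array (illustrative)
path = joinpath(mktempdir(), "array.jls")        # hypothetical file name

serialize(path, A)        # write the array in Julia's binary serialization format
B = deserialize(path)     # read it back

B == A
```

Note that the Serialization docs warn the format is not guaranteed to stay readable across Julia versions, so this is better suited to short-term storage than to archiving.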


If you store your array in DataFrames format (or any Tables.jl compatible format) then you can use JDF.jl.
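A rough sketch of what that could look like for a plain matrix, assuming the `JDF.save` / `JDF.load` functions from the JDF.jl README and the `DataFrame(matrix, :auto)` constructor from DataFrames.jl. The data and the output path are made up for illustration.

```julia
using DataFrames, JDF

A = rand(1_000, 3)                               # example matrix (illustrative)
df = DataFrame(A, :auto)                         # wrap it as a table with auto-generated column names

dir = joinpath(mktempdir(), "array.jdf")         # hypothetical output location
JDF.save(dir, df)                                # write the table to disk

df2 = DataFrame(JDF.load(dir))                   # load it back as a DataFrame
A2 = Matrix(df2)                                 # convert back to a plain matrix

A2 == A
```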

If you don’t need interop with R then Blosc.jl is quite good.

```julia
using Blosc, Serialization

uncompressed = rand(1_000_000)
compressed = Blosc.compress(uncompressed)  # a Vector{UInt8} of compressed bytes

# write the compressed bytes to disk
serialize("somewhere.jls", compressed)

# to read it back
compressed_read_back = deserialize("somewhere.jls")
decompressed = Blosc.decompress(Float64, compressed_read_back)

decompressed == uncompressed  # true
```

See this GitHub comment, which works for generic data, not just arrays.
Before running the code, first run

```julia
using TranscodingStreams, CodecZstd
```
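The linked comment is not reproduced here, so the following is only a sketch of the general idea, not necessarily the code from that comment: wrap the file in a Zstd stream and serialize through it. The helper names `save_zstd` and `load_zstd` are hypothetical, as is the file name.

```julia
using Serialization
using TranscodingStreams, CodecZstd

# Hypothetical helper: serialize any Julia value through a Zstd-compressing stream.
function save_zstd(path, data)
    open(ZstdCompressorStream, path, "w") do stream
        serialize(stream, data)
    end
end

# Hypothetical helper: read the value back through a Zstd-decompressing stream.
function load_zstd(path)
    open(ZstdDecompressorStream, path) do stream
        deserialize(stream)
    end
end

data = Dict("a" => zeros(1_000_000), "b" => "some text")  # generic data, not just an array
path = joinpath(mktempdir(), "data.jls.zst")               # hypothetical file name

save_zstd(path, data)
restored = load_zstd(path)

restored == data
```

How much space this saves depends on the data: highly regular values such as the zeros above shrink a lot, while random floats like `rand(1_000_000)` barely compress at all.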

Bear in mind that JDIF does not support missing or nothing values.
