So I had this idea, and I was going to open an issue on a relevant package for it, but then I realized the context is potentially broader than any single package.
In the geospatial domain, a well-known trick for reducing file size is to cleave off floating-point noise, keeping only the digits that are actually significant in the stored files. The same holds in many other domains: some datastores only need 4-8 bytes of precision, but after, say, a few manipulations the values end up carrying all this floating-point junk.
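A minimal sketch of what I mean, using Base's `round` with the `sigdigits` keyword: a value picks up noise after arithmetic, and rounding it back down shortens its decimal representation, which is what shrinks text-based formats like CSV or JSON.

```julia
# Floating-point noise accumulates after simple arithmetic:
x = 0.1 + 0.2                  # 0.30000000000000004
clean = round(x, sigdigits=4)  # 0.3

# The truncated value serializes far more compactly:
println(length(string(x)))      # 19 characters
println(length(string(clean)))  # 3 characters
```

Multiply that saving across millions of cells and it adds up fast.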
Is there a foreseeable benefit to a widespread pattern, implemented in, say, CSV.jl, JSON.jl, XLSX.jl, JuliaDB.jl, Parquet.jl, etc., for allowing "significant digit" truncation? I realize this is one of those semi-dangerous features: if an end user *thinks* they know the safe truncation level and gets it wrong, that's on them. But if you genuinely know your data's precision, it's kind of a godsend for large swaths of I/O.
I could imagine type wrappers for some of these columnar I/O libraries that specify the significant digits for each column to save on bytes. Maybe some formats already handle this to some extent?
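To make the wrapper idea concrete, here's a hypothetical sketch. `SigDigits` and `truncate_column` are not part of any existing package's API, just an illustration of what a writer could dispatch on:

```julia
# Hypothetical per-column precision spec; a writer (CSV.jl, Parquet.jl, ...)
# could accept one of these per column and truncate values on the way out.
struct SigDigits{T<:AbstractFloat}
    digits::Int
end

# Truncate a whole column to the declared number of significant digits.
truncate_column(col, spec::SigDigits) = round.(col, sigdigits=spec.digits)

col = [1.0000000000001, 2.9999999999998, 3.14159265358979]
truncate_column(col, SigDigits{Float64}(6))  # [1.0, 3.0, 3.14159]
```

The nice part of pushing this into the I/O layer is that the in-memory data stays untouched; the truncation only happens at serialization time, where the byte savings actually matter.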
Anyways, just a passing thought I had while trying to optimize my own ZMQ streams…