Are there file formats that support saving of metadata (global and columnar)?
Arrow.jl does.
Just as a side note, TableMetadataTools.jl has utilities to save metadata as TOML files and then reload them and attach them to a data frame. So in theory you could save a .csv and a .toml together and load them both.
It won’t handle cross-language metadata obviously (though maybe someone should write an R package…)
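To make the sidecar idea concrete, here is a minimal sketch that uses only CSV.jl, DataFrames.jl and the TOML standard library. It is not the TableMetadataTools.jl API: `save_with_sidecar` and `load_with_sidecar` are hypothetical helper names, the file layout (`table` and `columns` sections) is my own assumption, and the metadata style is not stored, so everything is re-attached as `:note`.

```julia
using CSV, DataFrames, TOML

# Hypothetical helper: write the table to <base>.csv and its metadata to <base>.toml.
function save_with_sidecar(base::AbstractString, df::AbstractDataFrame)
    CSV.write(base * ".csv", df)
    meta = Dict{String,Any}(
        # table-level metadata: key => value
        "table" => Dict{String,Any}(k => metadata(df, k) for k in metadatakeys(df)),
        # column-level metadata: column name => (key => value)
        "columns" => Dict{String,Any}(
            String(col) => Dict{String,Any}(k => colmetadata(df, col, k) for k in ks)
            for (col, ks) in colmetadatakeys(df)
        ),
    )
    open(base * ".toml", "w") do io
        TOML.print(io, meta)
    end
    return nothing
end

# Hypothetical helper: read both files back and re-attach the metadata.
function load_with_sidecar(base::AbstractString)
    df = CSV.read(base * ".csv", DataFrame)
    meta = TOML.parsefile(base * ".toml")
    for (k, v) in get(meta, "table", Dict{String,Any}())
        metadata!(df, k, v; style = :note)  # assumption: style is not persisted
    end
    for (col, entries) in get(meta, "columns", Dict{String,Any}()), (k, v) in entries
        colmetadata!(df, col, k, v; style = :note)
    end
    return df
end

# Round trip in a temporary directory
df = DataFrame(a = 1:3, b = ["A", "B", "C"])
metadata!(df, "source", "demo table"; style = :note)
colmetadata!(df, :a, "test", "hope this works"; style = :note)

base = joinpath(mktempdir(), "test")
save_with_sidecar(base, df)
df2 = load_with_sidecar(base)
colmetadata(df2, :a, "test")
```

This only works for metadata values that TOML can represent (strings, numbers, booleans, dates, arrays of those), which is probably fine for labels and notes.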
Are you sure? I’ve tried and it doesn’t seem to retain the metadata:
```julia
using Arrow
using DataFrames

df = DataFrame(a = 1:3, b = 'A':'C')
Arrow.write("test.arrow", df)
df = DataFrame(Arrow.Table("test.arrow"))

# attach column-level metadata and check that it is there
colmetadata!(df, :a, "test", "hope this works"; style = :note)
colmetadata(df, :a, "test")

# write again and read back: the metadata is gone
Arrow.write("test2.arrow", df)
df = DataFrame(Arrow.Table("test2.arrow"))
colmetadata(df, :a, "test")
```

```
ERROR: ArgumentError: no column-level metadata found for column "a"
Stacktrace:
 [1] colmetadata(df::DataFrame, col::Symbol, key::String, default::DataFrames.MetadataMissingDefault; style::Bool)
   @ DataFrames ~/.julia/packages/DataFrames/kcA9R/src/other/metadata.jl:367
 [2] colmetadata
   @ ~/.julia/packages/DataFrames/kcA9R/src/other/metadata.jl:360 [inlined]
 [3] colmetadata(df::DataFrame, col::Symbol, key::String)
   @ DataFrames ~/.julia/packages/DataFrames/kcA9R/src/other/metadata.jl:360
 [4] top-level scope
   @ ~/Documents/GitHub/ItsLivePlayground.jl/src/RiverTest.jl:41
```
Oh, that’s good to know.
Do you know of a file format that is able to store the metadata in the same file as the table?
No, no one has made that file format.
I found this thread here on how to append metadata to an Arrow file
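For anyone landing here, a rough sketch of what that looks like, assuming the `metadata` and `colmetadata` keyword arguments of `Arrow.write` and reading back with `Arrow.getmetadata`. Keys and values have to be strings, and the file name and metadata entries below are just placeholders. Note that this stores Arrow-level metadata; it does not by itself show up through `colmetadata(df, ...)` on the DataFrames side.

```julia
using Arrow, DataFrames

df = DataFrame(a = 1:3, b = ["A", "B", "C"])
path = joinpath(mktempdir(), "test_meta.arrow")

# Pass metadata explicitly when writing:
#   metadata    -> table-level, an iterable of string pairs
#   colmetadata -> a dict of column name (Symbol) => iterable of string pairs
Arrow.write(path, df;
    metadata = ["source" => "demo table"],
    colmetadata = Dict(:a => ["test" => "hope this works"]))

tbl = Arrow.Table(path)
table_meta = Arrow.getmetadata(tbl)    # table-level metadata
col_meta = Arrow.getmetadata(tbl.a)    # metadata of column a

table_meta["source"]
col_meta["test"]
```

If you want the values back on the data frame, you could copy them over yourself with `metadata!` and `colmetadata!` after `DataFrame(tbl)`.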