Saving Unitful DataFrame to file

Imagine I have a DataFrame with some of the columns being Unitful. Is there a way to save such a DataFrame into a file while preserving the information about the units?

The only solution I can think of is to strip the units and save the result to a .csv file. Then, when I load the file, I will have to multiply the columns by their units again.
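For illustration, that strip-and-reattach idea might look like the sketch below. It is only a sketch: the column names and the `units` dictionary are made up for the example, the actual file writing is left out, and it assumes simple units such as `m` or `s` whose `string` form `Unitful.uparse` can read back (compound units may not round-trip this way).

```julia
using DataFrames, Unitful

# Example data; the column names are hypothetical.
df = DataFrame(t = [1.0, 2.0] .* u"s", x = [0.5, 1.0] .* u"m")

# Record each column's unit as a string, then strip the units.
units = Dict(name => string(unit(eltype(df[!, name]))) for name in names(df))
plain = mapcols(col -> ustrip.(col), df)   # plain Float64 columns, ready to be written out

# ... write `plain` and `units` to disk, read them back ...

# Reattach the units after loading.
restored = copy(plain)
for name in names(restored)
    restored[!, name] = restored[!, name] .* Unitful.uparse(units[name])
end
```

The unit strings have to be stored somewhere next to the data (a header row, a sidecar file), which is the bookkeeping the question is trying to avoid.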

But maybe there is a well-established approach to this?

Does it need to be human readable? If not, Serialization.serialize works out of the box for DataFrames + Unitful (see "Serialization" in the Julia documentation).
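A minimal round trip might look like the sketch below. The example data and the temporary path are made up for illustration, and the reading session is assumed to have DataFrames and Unitful loaded as well.

```julia
using DataFrames, Unitful, Serialization

# Example data; the column names are hypothetical.
df = DataFrame(t = [1.0, 2.0, 3.0] .* u"s", x = [0.5, 1.0, 1.5] .* u"m")

path = tempname()          # scratch file, for illustration only
serialize(path, df)        # writes the DataFrame, units included
df2 = deserialize(path)    # reads it back with the same column types
```

One caveat, as far as I know: the serialized format is tied to the Julia and package versions that wrote it, so it is better suited to caching than to long-term archiving.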

No, it does not have to be human readable, but I would like loading the file to be fast.

Serialization is a pretty good fit then.

By the way, does saving to .jld2 work as well?

I remember trying various serialization solutions and found Serialization to work best in terms of speed and handling custom structs. It is also a standard library, which is a plus.

For reference, a 450 MB serialized DataFrame file takes 109 ms to deserialize using Serialization.

Everyone wants this (for the REPL and, e.g., writing to CSV):
https://github.com/PainterQubits/Unitful.jl/issues/412
https://github.com/PainterQubits/Unitful.jl/issues/388
https://github.com/PainterQubits/Unitful.jl/pull/298
https://github.com/PainterQubits/Unitful.jl/issues/391
https://github.com/PainterQubits/Unitful.jl/issues/466

There is a PR too:
https://github.com/PainterQubits/Unitful.jl/pull/470

There is a package for reversible conversion between units and strings: UnitfulParsableString.jl.
I'm using these functions, together with that package, for saving to Arrow:

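using DataFrames, Unitful
using UnitfulParsableString  # makes string(unit) produce text that uparse can read back

# Strip the units from every Unitful column and record each unit as column metadata.
function deunitful(df::AbstractDataFrame)
    for colname in names(df)
        col = df[!, colname]
        vals = skipmissing(col)
        isempty(vals) && continue  # nothing to inspect in an empty or all-missing column
        firstval = first(vals)
        if firstval isa Unitful.Quantity
            # Assumes every value in the column shares the unit of the first one.
            ustr = string(Unitful.unit(firstval))
            df[!, colname] = ustrip.(col)
            colmetadata!(df, colname, "units", ustr, style=:note)
        end
    end
    df
end

# Reattach the units recorded by deunitful and drop the metadata entry.
function reunitful(df::AbstractDataFrame)
    for colname in names(df)
        if "units" in colmetadatakeys(df, colname)
            ustr = colmetadata(df, colname, "units")
            un = Unitful.uparse(ustr)
            df[!, colname] = df[!, colname] .* un
            deletecolmetadata!(df, colname, "units")
        end
    end
    df
end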