Way to store dict of DataFrame tables in a single file?

The DataFrames docs say the best way to store a DataFrame is as a CSV file:

using DataFrames
using CSV

df = DataFrame(x = 1, y = 2)
CSV.write("output.csv", df)

However, let’s say you want to have:

  • a dictionary of DataFrames (akin to sheets in an Excel file)
  • that you save in a single file (so you don’t have like 50 of them)

How would you get the following code to not error out?

df = DataFrame(x = [1,2,3], y = [2,2,2])
dg = DataFrame(x = [4,0,4], y = [0,1,3])
dh = DataFrame(x = [4,2,1], y = [1,0,1])

cur_dict = Dict(
    :f => df,
    :g => dg,
    :h => dh
)
CSV.write("woof.csv", cur_dict)

MethodError: no method matching schema(::Dict{Symbol,DataFrames.DataFrame})
Closest candidates are:
  schema(::S, ::DataStreams.Data.Query{code,columns,e,limit,offset}) where {code, S, columns, e, limit, offset} at /Users/dan/.julia/v0.6/DataStreams/src/query.jl:217
  schema(::S, ::DataStreams.Data.Query{code,columns,e,limit,offset}, ::Any) where {code, S, columns, e, limit, offset} at /Users/dan/.julia/v0.6/DataStreams/src/query.jl:217
  schema(::DataFrames.DataFrame) at /Users/dan/.julia/v0.6/DataFrames/src/abstractdataframe/io.jl:224
  ...

Stacktrace:
 [1] #stream!#120(::Bool, ::Dict{Int64,Function}, ::Function, ::Array{Any,1}, ::Array{Any,1}, ::Void, ::Void, ::Array{Any,1}, ::DataStreams.Data.#stream!, ::Dict{Symbol,DataFrames.DataFrame}, ::Type{CSV.Sink}, ::String, ::Vararg{String,N} where N) at /Users/dan/.julia/v0.6/DataStreams/src/query.jl:544
 [2] (::DataStreams.Data.#kw##stream!)(::Array{Any,1}, ::DataStreams.Data.#stream!, ::Dict{Symbol,DataFrames.DataFrame}, ::Type{CSV.Sink}, ::String) at ./<missing>:0
 [3] #write#53(::Bool, ::Dict{Int64,Function}, ::Array{Any,1}, ::Function, ::String, ::Dict{Symbol,DataFrames.DataFrame}) at /Users/dan/.julia/v0.6/CSV/src/Sink.jl:136
 [4] write(::String, ::Dict{Symbol,DataFrames.DataFrame}) at /Users/dan/.julia/v0.6/CSV/src/Sink.jl:136
 [5] include_string(::String, ::String) at ./loading.jl:522

Edit: I would also be fine with JSON, XML, or whatever else as the storage format.

Is there any reason you want to have it all in a single CSV? You could do this.

for (key, data) in cur_dict
    CSV.write("$key.csv", data)  # one file per table: f.csv, g.csv, h.csv
end
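If it really does have to be a single CSV, one possible workaround is to tag each row with the name of the table it came from, stack everything, and split it again after reading. This is only a rough sketch: it assumes a recent DataFrames.jl (1.x) and CSV.jl, assumes all tables share the same columns, and the `table` column name and the temp-file path are made up for illustration.

```julia
using DataFrames
using CSV

df = DataFrame(x = [1, 2, 3], y = [2, 2, 2])
dg = DataFrame(x = [4, 0, 4], y = [0, 1, 3])
cur_dict = Dict(:f => df, :g => dg)

# Copy each table and add a `table` column holding its dictionary key
# (hypothetical column name; the scalar is broadcast to every row).
tagged = [insertcols!(copy(tbl), :table => string(name)) for (name, tbl) in cur_dict]

# Stack all tagged tables and write them to one CSV file.
stacked = reduce(vcat, tagged)
path = joinpath(mktempdir(), "woof.csv")
CSV.write(path, stacked)

# Read the file back and rebuild the dictionary, dropping the helper column.
loaded = CSV.read(path, DataFrame)
restored = Dict(
    Symbol(key.table) => select(DataFrame(group), Not(:table))
    for (key, group) in pairs(groupby(loaded, :table))
)
```

Note that CSV is text, so column types are re-inferred on reading; if the tables have different columns or types that matter, a binary format is probably the safer route.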

Use JLD2.jl or BSON.jl.
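For example, a minimal sketch with JLD2.jl might look like the following. It assumes JLD2.jl and DataFrames.jl are installed; the file name and the `"tables"` key are arbitrary choices for illustration, and the file should be read back with compatible package versions.

```julia
using DataFrames
using JLD2

df = DataFrame(x = [1, 2, 3], y = [2, 2, 2])
dg = DataFrame(x = [4, 0, 4], y = [0, 1, 3])
dh = DataFrame(x = [4, 2, 1], y = [1, 0, 1])
cur_dict = Dict(:f => df, :g => dg, :h => dh)

# Hypothetical output location; any writable path works.
path = joinpath(mktempdir(), "woof.jld2")

# Store the whole dictionary under one key in a single file.
jldopen(path, "w") do file
    file["tables"] = cur_dict
end

# Read the dictionary back from the same file.
loaded = jldopen(path, "r") do file
    file["tables"]
end
```

After this, `loaded[:g]` should give back the same table as `dg`, and the whole dictionary lives in one `.jld2` file instead of one file per DataFrame.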
