DataFrame or DataFrameRow to Dict and JSON

Wanted to double-check if there are any recommended ways to convert a DataFrame (or DataFrameRow) into a Dict (for eventually conversion to JSON). This is the inverse of this question.

“List of Dicts” JSON format would be ideal.

For example, converting:

df = DataFrame(x = 1:3, y = 4:6)
3×2 DataFrame
│ Row │ x     │ y     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 1     │ 4     │
│ 2   │ 2     │ 5     │
│ 3   │ 3     │ 6     │

to

[{"x":1,"y":4}
,{"x":2,"y":5}
,{"x":3,"y":6}
]

Here’s my basic implementation. Was wondering if there’s anything more efficient out there.

function dataframe_to_json(df)
    li = []
    for row in eachrow(df)
        di = Dict()
        for name in names(row)
            di[string(name)] = row[name]
        end
        push!(li, di)
    end
    JSON.json(li)
end

df = DataFrame(x = 1:3, y = 4:6)
dataframe_to_json(df)

JSONTables.jl enables conversion from a Table (or DataFrame) to JSON and back again. However, the recovered table is often not the same as the input table because the package doesn’t preserve types.

Issue #6 in that package concerns this problem and offers a solution, though in its current form the proposed solution is particular to DataFrames (not all types that satisfy the Tables interface) and JSON (not JSON3).

Combining the 2 approaches gives us the best of both worlds - a package that converts between Tables and JSON in either direction without loss of information.

2 Likes