How to create a `DataFrame` using NamedTuple keys as column names

I have a function that returns a NamedTuple with 7 entries. The names are obviously Symbols and the entries are Int64. As this function is run many times, I would like to store the returned data as rows in a DataFrame, but I am having trouble creating a DataFrame to begin with.

Here is an MWE:

    using DataFrames
    ntd = (A = 2, B = 3, C = 4)
    DataFrame(keys(ntd)) ## OR
    DataFrame(String.(keys(ntd)))

which results in the error:
ERROR: ArgumentError: 'NTuple{7,Symbol}' iterates 'Symbol' values, which don't satisfy the Tables.jl Row-iterator interface

I know I can manually construct the DataFrame by typing in my column names, but is there a way to do this programmatically?

Should’ve searched Google some more. I found the answer.

dfnames = [keys(nt)...]
DataFrame([Int64 for i = 1:7], dfnames, 0)

where nt is my NamedTuple.
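For anyone reading this on a newer DataFrames.jl: as far as I am aware, the constructor taking a vector of types plus a row count is no longer available in the 1.x releases. A minimal sketch of the same idea, assuming a recent version, is to build one empty `Int64` column per key and pass them as name => column pairs. The 3-entry `nt` here is just a stand-in for the real 7-entry NamedTuple.

```julia
using DataFrames

# Stand-in for the NamedTuple returned by the function (assumed shape).
nt = (A = 2, B = 3, C = 4)

# One empty Int64 column per key, paired with its name.
# This yields a DataFrame with 0 rows and the keys as column names.
df = DataFrame([name => Int64[] for name in keys(nt)])

# Each returned NamedTuple can then be stored as a row.
push!(df, nt)
```

The pair form keeps each column name next to its column, so the names and the element types cannot drift out of order.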

Even easier: DataFrame([ntd]).
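To spell that out: a vector of NamedTuples is a valid row table for Tables.jl, which is why wrapping the NamedTuple in a vector works where passing only its keys does not. A small sketch, reusing `ntd` from the question (the values in the second row are made up for illustration):

```julia
using DataFrames

ntd = (A = 2, B = 3, C = 4)

# The keys become column names and the NamedTuple becomes the first row.
df = DataFrame([ntd])

# Later results with the same keys can be appended as further rows.
push!(df, (A = 5, B = 6, C = 7))

# Inspect the column names as Symbols.
propertynames(df)
```

Note that the column element types are taken from the first NamedTuple, so this suits the case where every call returns the same types.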

You do not have to initialize a DataFrame with columns in this case. You can push! a NamedTuple even into a DataFrame that has no columns yet, e.g.

df = DataFrame()
push!(df, (a=1,b=2))
push!(df, (a=3,b=4))

just works.

The only case where you would need this is when you expect values of heterogeneous types within a column (in which case the NamedTuple will give the column an overly narrow type the first time you push! it to a DataFrame). In this case you can use a constructor that takes a vector of types and a vector of column names, e.g. DataFrame([Int, Float64], [:a, :b]).
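A hedged sketch of that situation, written with typed empty columns as keyword arguments, since I believe the vector-of-types constructor is not available in recent DataFrames.jl releases. The column names `a` and `b` follow the example above; the pushed values are made up.

```julia
using DataFrames

# Declare the column types up front: b is Float64 even though
# the first value pushed into it happens to be an integer.
df = DataFrame(a = Int[], b = Float64[])

push!(df, (a = 1, b = 2))     # 2 is converted to 2.0 on insertion
push!(df, (a = 3, b = 4.5))   # fits, because b was declared as Float64

# Had the DataFrame started empty via DataFrame(), column b would have
# been created as an Int column by the first push!, and the second
# push! could not have stored 4.5 in it.
```

Declaring the widest type you expect per column up front is the design choice here: the first row no longer decides the column type.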

Thanks. In fact, my columns hold heterogeneous data, so just constructing an empty DataFrame didn’t work for me.