Converting Array of Named Tuples to DataFrames

I have an array of NamedTuples that I populate in a for loop and eventually want to convert to a DataFrame. If I initialize the array as NamedTuple[], the conversion fails. Initializing it as Any[] works, but I would like to avoid Any[]. The code below demonstrates this behavior. Why does the conversion fail for NamedTuple[] but not for Any[], and is this intended?

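using DataFrames

# Works: the comprehension infers a concrete NamedTuple element type
x1 = [(x=rand(), y=rand([:a,:b])) for i in 1:10]
d1 = DataFrame(x1)

# Works: the element type is Any
x2 = Any[(x=rand(), y=rand([:a,:b])) for i in 1:10]
d2 = DataFrame(x2)

# Does not work: the element type is the abstract NamedTuple
x3 = NamedTuple[(x=rand(), y=rand([:a,:b])) for i in 1:10]
d3 = DataFrame(x3)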

I think it falls through the cracks of the Tables interface:

julia> Tables.istable(typeof(x3))

but then the schema is not defined.

BTW, “fails” is not very informative; including the error message helps:

julia> DataFrame(x3)
ERROR: MethodError: no method matching Tables.Schema(::Type{NamedTuple})
Closest candidates are:
  Tables.Schema(::Any, ::Nothing) at /home/tamas/.julia/packages/Tables/Icwxo/src/Tables.jl:156
  Tables.Schema(::Any, ::Any) at /home/tamas/.julia/packages/Tables/Icwxo/src/Tables.jl:157
  Tables.Schema(::Tuple{Vararg{Symbol,N} where N}, ::Type{T<:Tuple}) where T<:Tuple at /home/tamas/.julia/packages/Tables/Icwxo/src/Tables.jl:154
 [1] schema(::Tables.DataValueUnwrapper{Array{NamedTuple,1}}) at /home/tamas/.julia/packages/Tables/Icwxo/src/query.jl:29
 [2] columns at /home/tamas/.julia/packages/Tables/zyTIO/src/fallbacks.jl:152 [inlined]
 [3] DataFrame(::Array{NamedTuple,1}) at /home/tamas/.julia/packages/DataFrames/z2XOB/src/other/tables.jl:45
 [4] top-level scope at none:0

IMO a bug report for DataFrames.jl would be in order, but I am not sure what the best fix is.
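
In the meantime, a possible workaround is to make sure the vector has a concrete element type before it reaches DataFrame. The error above is raised for Type{NamedTuple}, which carries no field names or types, whereas x1 has a fully parameterized NamedTuple element type. The sketch below is only an illustration under that assumption: the names x3c, x4, and Row are hypothetical, and it has not been checked against the exact package versions in the stack trace.

```julia
using DataFrames

# Populate the vector in a loop, as in the question; eltype is the abstract NamedTuple.
x3 = NamedTuple[]
for i in 1:10
    push!(x3, (x = rand(), y = rand([:a, :b])))
end

# Option 1: narrow the element type after the loop.
# Broadcasting identity rebuilds the vector with the tightest common element type.
x3c = identity.(x3)
d3 = DataFrame(x3c)

# Option 2: declare the concrete element type up front (Row is a hypothetical alias).
Row = NamedTuple{(:x, :y), Tuple{Float64, Symbol}}
x4 = Row[]
for i in 1:10
    push!(x4, (x = rand(), y = rand([:a, :b])))
end
d4 = DataFrame(x4)
```

Option 1 keeps the loop unchanged at the cost of one extra copy; Option 2 avoids the copy but requires knowing the field names and types in advance.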

Thanks for your explanation. I will submit an issue to DataFrames since it does not seem like desired behavior.

A link to the issue for future reference:
