How to construct a dataframe of dataframes

I was trying to create a 3-D DataFrame, but it seems DataFrames can only be 2-D. Someone suggested I create a “DataFrame of DataFrames”, where each entry of a DataFrame is itself a DataFrame. I am not sure how to do this. I tried a few different syntaxes, but none of them worked.

df = DataFrame(DataFrame(rand(4,4), 5)

This syntax does not work. I basically need a DataFrame with 5 entries, where each entry is a 4x4 DataFrame.

I would like to index the resulting DataFrame of DataFrames like this:
df[:col_name][:depth_name][n]

I saw this post, but it looks like an array of DataFrames as opposed to a DataFrame of DataFrames. That might work, but I need to understand how to index it.

Yeah, you have a few different options here:

Array of DataFrames:

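using DataFrames

dfs = DataFrame[]  # empty Vector{DataFrame}
# :auto generates column names x1..x4 (required for matrices in DataFrames.jl 1.0+)
push!(dfs, DataFrame(rand(4, 4), :auto))
push!(dfs, DataFrame(rand(4, 4), :auto))

dfs[1] # get 1st DataFrame
dfs[2] # get 2nd DataFrame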
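
For the "5 entries, each a 4x4 DataFrame" case from the question, a comprehension is probably the shortest route. This is only a sketch, assuming DataFrames.jl 1.0 or later; the column names x1..x4 come from :auto and are not something the question specified. It also shows one way to index into the result.

```julia
using DataFrames

# Build five independent 4x4 DataFrames; :auto names the columns x1..x4
dfs = [DataFrame(rand(4, 4), :auto) for _ in 1:5]

dfs[2]           # the second 4x4 DataFrame
dfs[2].x3        # column x3 of the second DataFrame
dfs[2][1, :x3]   # row 1, column x3 of the second DataFrame
dfs[2].x3[1]     # the same element, indexing the column vector instead
```

So the index order here is depth first, then column, then row, rather than the df[:col_name][:depth_name][n] order asked for.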

If your DataFrames all have the same columns, you might store each inner column as one cell of the outer DataFrame, something like:

df1 = DataFrame(a=[1,2,3], b=[4,5,6])
df2 = DataFrame(a=[7,8,9], b=[10,11,12])

df_all = DataFrame(a=[df1.a, df2.a], b=[df1.b, df2.b])

df_all.a[1] # access df1.a
df_all.a[2] # access df2.a
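
If you want the cells to be whole DataFrames rather than column vectors, a column can simply hold DataFrames, since a column is just a vector. A minimal sketch, assuming DataFrames.jl 1.0 or later; the column names depth and data are made up for illustration:

```julia
using DataFrames

df1 = DataFrame(a=[1, 2, 3], b=[4, 5, 6])
df2 = DataFrame(a=[7, 8, 9], b=[10, 11, 12])

# `depth` and `data` are hypothetical column names; `data` holds the inner DataFrames
outer = DataFrame(depth=[:d1, :d2], data=[df1, df2])

outer.data[2]        # the second inner DataFrame (df2)
outer.data[2].a      # column a of that inner DataFrame
outer.data[2].a[3]   # third element of that column
```

The depth column is only a label here, so looking up an inner DataFrame by name would still need a filter on outer.depth, or a Dict keyed by name instead.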

Oh I see. So you are saying that the different columns of the outer dataframe are matched to the inner dataframes. That is why a=[df1.a, df2.a]. Let me play around with this. Thanks for your help.