Column types in DataFrames

I have a big DataFrame named df.

I want to create, from it, another DataFrame with 2 columns: the first column should contain the column names of df. The second column should contain its column types.

I have tried this:

using DataFrames
df = DataFrame(a=Int[], b=String[], c=Union{Missing, Int})

function get_column_info(df::DataFrame)
    col_names = names(df)
    col_types = eltype.(eachcol(df))
    return DataFrame(Column_Names = col_names, Column_Types = col_types)
end

new_df = get_column_info(df)

And I get this:

[image: screenshot of new_df]

I expected to see Union{Missing, Int64} for column “c”. That’s not what I got.

So I tried something different:

function get_column_info(df::DataFrame)
    col_names = names(df)
    col_types = [typeof(col) for col in eachcol(df)]
    return DataFrame(Column_Names = col_names, Column_Types = col_types)
end

new_df = get_column_info(df)

This time what I got is:

[image: screenshot of new_df]

It is not what I expected.

Does anybody have any suggestions as to how to proceed?

Thank you

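For consistency with the other DataFrame entries, I think this should be:
c=Union{Missing, Int}[]

Without the trailing [], Union{Missing, Int} is the type itself rather than an empty vector with that element type.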
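As a minimal sketch of that fix, assuming a reasonably recent DataFrames.jl, the original function can be kept unchanged and only the constructor call needs the extra []:

```julia
using DataFrames

# Every column is now an empty vector, including c.
# Note: Union{Missing, Int} alone is a type (typeof(Union{Missing, Int}) is Union),
# while Union{Missing, Int}[] is an empty vector with that element type.
df = DataFrame(a=Int[], b=String[], c=Union{Missing, Int}[])

function get_column_info(df::DataFrame)
    col_names = names(df)             # column names as Strings
    col_types = eltype.(eachcol(df))  # element type of each column
    return DataFrame(Column_Names = col_names, Column_Types = col_types)
end

new_df = get_column_info(df)
```

With this change, the Column_Types entry for c should be the union type the question was expecting.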


I think you are right.

The problem seems to be solved, thank you.

You might also like describe(df), and in particular describe(df, :eltype), which limits the summary to the element type of each column.
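A short sketch of that suggestion, assuming the corrected df from earlier in the thread and a recent DataFrames.jl (the name info is only illustrative):

```julia
using DataFrames

df = DataFrame(a=Int[], b=String[], c=Union{Missing, Int}[])

# Ask describe for the :eltype statistic only.
# The result is itself a DataFrame with the columns `variable` and `eltype`.
info = describe(df, :eltype)
```

Note that the variable column holds Symbols rather than Strings, so this is close to, but not identical to, the output of get_column_info.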
