How to create an empty dataframe from a vector of row names and column names

Fred · June 25, 2019, 12:54pm

Hi,

I would like to create a large empty DataFrame of Floats from a vector of row names and a vector of column names. For example:

rn=["row1", "row2", "row3"]
cn=["col1", "col2", "col3"]

Thanks !

johann.spies · June 25, 2019, 2:06pm

It is not quite clear to me how a DataFrame containing many columns, where the first column contains many row names, can be described as “empty”.

Do you want this?

julia> df = DataFrame()
0×0 DataFrame

julia> df[:rowname] = rn
3-element Array{String,1}:
 "row1"
 "row2"
 "row3"
julia> for c in cn
          df[Symbol(c)] = 0.0
          end

julia> df
3×4 DataFrame
│ Row │ rowname │ col1    │ col2    │ col3    │
│     │ String  │ Float64 │ Float64 │ Float64 │
├─────┼─────────┼─────────┼─────────┼─────────┤
│ 1   │ row1    │ 0.0     │ 0.0     │ 0.0     │
│ 2   │ row2    │ 0.0     │ 0.0     │ 0.0     │
│ 3   │ row3    │ 0.0     │ 0.0     │ 0.0     │
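
For readers on a more recent DataFrames.jl, here is a minimal sketch of the same idea. It assumes DataFrames.jl 1.x, where columns are assigned with `df[!, name] = vector` and string column names are accepted directly; it is one possible equivalent, not the code from the post above.

```julia
using DataFrames

rn = ["row1", "row2", "row3"]
cn = ["col1", "col2", "col3"]

# Start with the row-name column, then add one Float64 column per name.
df = DataFrame(rowname = rn)
for c in cn
    # df[!, name] = vector adds the column (or replaces it if it exists)
    df[!, c] = zeros(length(rn))
end
```

Each column gets its own freshly allocated vector of zeros, so later in-place changes to one column do not affect the others.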

Fred · June 25, 2019, 2:09pm

Thanks @johann.spies your solution is very good !

DoktorMike · June 25, 2019, 3:44pm

In your last example: why is the call to Symbol necessary? I’m sure it is; I just don’t understand why.

pdeffebach · June 25, 2019, 3:49pm

DataFrames don’t index columns by strings, but rather by Symbols. There’s no super deep reason why it is that way other than that Symbols are more lightweight than strings and have some nice features.

Using a Symbol, for example, makes it easier for df.a to refer to column :a.

nalimilan · June 25, 2019, 4:07pm

Most importantly, symbols are faster to look up than strings since they are interned.
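
To make the string-to-Symbol conversion concrete, here is a small sketch using only Base Julia. The variable names `a`, `syms`, and `same` are illustrative. Note that recent DataFrames.jl versions also accept strings as column names, so the explicit conversion matters mostly for older versions such as the one used in this thread.

```julia
cn = ["col1", "col2", "col3"]

# Symbol(...) builds the Symbol with the same name as the string.
a = Symbol("col1")

# Symbols with the same name are the same object, so === holds.
same = (a === :col1)

# The dot broadcasts Symbol over the vector, converting every name at once.
syms = Symbol.(cn)
```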

DoktorMike · June 25, 2019, 4:23pm

Got it. Thanks guys.

bkamins · June 25, 2019, 4:24pm

An alternative way to do it is:

julia> df = DataFrame(fill(Float64, length(cn)), Symbol.(cn), length(rn))
3×3 DataFrame
│ Row │ col1         │ col2         │ col3         │
│     │ Float64      │ Float64      │ Float64      │
├─────┼──────────────┼──────────────┼──────────────┤
│ 1   │ 1.0735e-313  │ 1.07357e-313 │ 1.0735e-313  │
│ 2   │ 6.61729e-316 │ 7.6592e-316  │ 3.53922e-316 │
│ 3   │ 7.6592e-316  │ 7.6592e-316  │ 3.53922e-316 │

julia> df.rowname = rn
3-element Array{String,1}:
 "row1"
 "row2"
 "row3"

julia> df
3×4 DataFrame
│ Row │ col1         │ col2         │ col3         │ rowname │
│     │ Float64      │ Float64      │ Float64      │ String  │
├─────┼──────────────┼──────────────┼──────────────┼─────────┤
│ 1   │ 1.0735e-313  │ 1.07357e-313 │ 1.0735e-313  │ row1    │
│ 2   │ 6.61729e-316 │ 7.6592e-316  │ 3.53922e-316 │ row2    │
│ 3   │ 7.6592e-316  │ 7.6592e-316  │ 3.53922e-316 │ row3    │

A side note on the other solution: the df[:x] = 0.0 syntax will soon no longer be recommended; it will become df[:x] .= 0.0 (using broadcasting).
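
The table above shows arbitrary values because the columns are allocated without being initialized. As a hedged sketch for recent versions (assuming DataFrames.jl 1.x, where a matrix plus a vector of names can be passed to the constructor), a zero-filled frame with the row names in the first column could be built like this:

```julia
using DataFrames

rn = ["row1", "row2", "row3"]
cn = ["col1", "col2", "col3"]

# Build the Float64 columns from a zero matrix; cn supplies the column names.
df = DataFrame(zeros(length(rn), length(cn)), cn)

# Insert the row names as the first column.
insertcols!(df, 1, :rowname => rn)
```

Using `zeros` rather than uninitialized memory costs one extra pass over the data but gives predictable contents.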

Fred · June 25, 2019, 4:37pm

Thank you @bkamins, your solution is great too!

I have a question: what is the best way to change some values in the dataframe using the column names and the row names? I tried something like
df[[:rowname == "row1"], :col3] = 5

bkamins · June 25, 2019, 4:45pm

I assume the row-name is unique. In this case what you should do is:

df[findfirst(==("row1"), df.rowname), :col3] = 5

An alternative way to write it is, e.g.:

df.col3[df.rowname .== "row1"] .= 5

This will also work if the row names are not unique.
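
The two approaches can be compared on a small frame with a repeated row name. This is a sketch assuming DataFrames.jl 1.x; the data and the values 5 and 7 are made up for illustration.

```julia
using DataFrames

# "row1" appears twice on purpose, to show the difference between the approaches.
df = DataFrame(rowname = ["row1", "row2", "row1"], col3 = zeros(3))

# Single row: findfirst returns the index of the first match only.
i = findfirst(==("row2"), df.rowname)
df[i, :col3] = 5

# All matching rows: a boolean mask combined with broadcast assignment.
df[df.rowname .== "row1", :col3] .= 7
```

Here `==("row2")` creates a one-argument function that compares its input with "row2". If no row matches, `findfirst` returns `nothing`, so it is worth checking the result before indexing.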

Fred · June 25, 2019, 4:47pm

@bkamins very interesting ! thanks !

Fred · June 27, 2019, 6:29am

It is also possible to change a whole column at once:

julia> df.col3 .= [1,2,3]
3-element Array{Float64,1}:
 1.0
 2.0
 3.0

julia> df
3×3 DataFrame
│ Row │ col1         │ col2         │ col3    │
│     │ Float64      │ Float64      │ Float64 │
├─────┼──────────────┼──────────────┼─────────┤
│ 1   │ 9.88131e-324 │ 6.92857e-310 │ 1.0     │
│ 2   │ 4.94066e-324 │ 6.92857e-310 │ 2.0     │
│ 3   │ 6.92859e-310 │ 6.92857e-310 │ 3.0     │
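
In recent versions the distinction between updating a column in place and replacing it can be made explicit through the row selector. A minimal sketch, assuming DataFrames.jl 1.x indexing rules and a made-up two-column frame:

```julia
using DataFrames

df = DataFrame(col1 = zeros(3), col3 = zeros(3))

# `:` writes into the existing column, so the values are converted to Float64.
df[:, :col3] = [1, 2, 3]

# `!` replaces the column with the new vector, so it becomes an Int column.
df[!, :col1] = [1, 2, 3]
```

The in-place form keeps the column type stable, which is usually what is wanted for a frame that was created to hold Floats.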
