How to construct a DataFrame with a loop?

#1

Hello,

I need to construct a DataFrame similar to:

df=DataFrame([
            String, String, String, ...
            ], 
            [
            :cod_1, :cod_2, :cod_3, cod_... 
            ],
            0);

where the number of String entries is huge. Supposing that this number is N, how can I construct df with a loop similar to:

df=DataFrame([
            for k in 1:N String
            ], 
            [
            for k in 1:N string("cod_",k) 
            ],
            0);

Of course, the previous code does not work. Could you tell me the right way to write such code? Thanks!

#2

Please see PSA: how to quote code with backticks (and update your post accordingly).

#3

It is done! Thanks.

#4

I’m not sure whether I missed something, but this is valid code:

julia> n = 7
7

julia> DataFrame([String for k in 1:n],
                 [Symbol("cod_$k") for k in 1:n],
                 0)
0×7 DataFrame

#5

Thanks! It works! However, how can I use it when I need both Date and String columns? More precisely, this code does not work:

DataFrame([Date, String for k in 1:2],
                 [:val_date, Symbol("cod_$k") for k in 1:2],
                 0)

Of course,

DataFrame([Date, String, String, ...],
                 [:val_date, :cod_1, :cod_2, ...],
                 0)

is OK.

I found a solution by using 2 dataframes and the join function:

df_1=DataFrame([Date, String],
                 [:val_date, :cod_1],
                 0)
df_2=DataFrame([String for k in 1:7],
                 [Symbol("cod_$k") for k in 1:7],
                 0)
df=join(df_1,df_2,on=:cod_1)

But it seems to me that it could be simpler. Is that the case?

#6

To build the Vector{DataType}, you can use vcat:

vcat(Date, fill(String, 2)) # Example [Date, String, String]

You can use vcat for the names as well.
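
As a minimal sketch of that suggestion (the variable names `n`, `coltypes` and `colnames` are only illustrative and do not come from the posts above), both vectors can be built the same way:

```julia
using Dates  # standard library, provides the Date type

n = 2  # number of String columns (illustrative value)

# Column types: one Date followed by n Strings
coltypes = vcat(Date, fill(String, n))

# Column names: :val_date followed by :cod_1, :cod_2, ...
colnames = vcat(:val_date, [Symbol("cod_", k) for k in 1:n])
```

The two vectors can then be passed to the DataFrame constructor in place of the hand-written lists.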

#7

Yes, and you might also consider using broadcasting for the column names.

julia> using DataFrames, Dates

julia> n = 7
7

julia> makecol(k) = Symbol("cod_", k)
makecol (generic function with 1 method)

julia> df = DataFrame([Date, fill(String, n)...],
                      [:valdate, makecol.(1:n)...],
                      0)
0×8 DataFrame
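
Note that the `DataFrame(types, names, nrows)` form used in this thread belongs to older DataFrames.jl releases; as far as I know it is no longer available from DataFrames.jl 1.0 on. A minimal sketch of the same schema for newer versions, assuming the `DataFrame(columns, names)` constructor and passing one empty, typed vector per column (`coltypes` and `colnames` are illustrative names):

```julia
using DataFrames, Dates

n = 7
makecol(k) = Symbol("cod_", k)

# Same types and names as above, built by splatting
coltypes = [Date, fill(String, n)...]
colnames = [:valdate, makecol.(1:n)...]

# One empty vector of the right element type per column
df = DataFrame([T[] for T in coltypes], colnames)
```

This should again give an empty DataFrame with one Date column and n String columns.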

#8

For a more general way to do this, you can create an “empty” DataFrame with the schema that you want and then just push! Dicts or NamedTuples to add observations to it. I have done this with a fair amount of success when the creation of the DataFrame is less trivial.
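
A minimal sketch of that approach, assuming a recent DataFrames.jl in which `push!` accepts a NamedTuple or a Dict whose keys match the column names (the column names and values below are made up for illustration):

```julia
using DataFrames, Dates

# Empty DataFrame that only fixes the schema: column names and element types
df = DataFrame(val_date = Date[], cod_1 = String[], cod_2 = String[])

# Add one observation as a NamedTuple
push!(df, (val_date = Date(2020, 1, 1), cod_1 = "a", cod_2 = "b"))

# Add another observation as a Dict keyed by column name
push!(df, Dict(:val_date => Date(2020, 1, 2), :cod_1 => "c", :cod_2 => "d"))
```

Because the columns are typed up front, pushing a value of the wrong type fails instead of silently changing the schema.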
