DataFrame: New columns with names from existing column

I’m trying to take an existing DataFrame that looks something like the following:

df = DataFrame(:x => [1, 2, 3, 1, 2], :y => [:A, :A, :A, :B, :B], :z => 1:2:9)
5×3 DataFrame
 Row │ x      y       z     
     │ Int64  Symbol  Int64 
─────┼──────────────────────
   1 │     1  A           1
   2 │     2  A           3
   3 │     3  A           5
   4 │     1  B           7
   5 │     2  B           9

and create a new one by 1) grouping by the values in column :y, 2) joining on the repeated values in :x and 3) creating multiple :z columns with new names pulled from :y groups. To illustrate, the output would look like

3×3 DataFrame
 Row │ x      A      B       
     │ Int64  Int64  Int64?  
─────┼───────────────────────
   1 │     1      1        7
   2 │     2      3        9
   3 │     3      5  missing 

I’ve tried a few different approaches with groupby and outerjoin but haven’t found the right one yet. This is the closest I’ve gotten

outerjoin(groupby(df, :y)..., on=:x, makeunique=true)
3×5 DataFrame
 Row │ x      y        z       y_1      z_1     
     │ Int64  Symbol?  Int64?  Symbol?  Int64?  
─────┼──────────────────────────────────────────
   1 │     1  A             1  B              7
   2 │     2  A             3  B              9
   3 │     3  A             5  missing  missing 
julia> unstack(df, :x, :y, :z)
3×3 DataFrame
 Row │ x      A       B       
     │ Int64  Int64?  Int64?  
─────┼────────────────────────
   1 │     1       1        7
   2 │     2       3        9
   3 │     3       5  missing

see unstack help for more explanations (this is good for doing pivot like operations from the spreadsheets world).

4 Likes