I’m trying to take an existing DataFrame that looks something like the following:
df = DataFrame(:x => [1, 2, 3, 1, 2], :y => [:A, :A, :A, :B, :B], :z => 1:2:9)
5×3 DataFrame
Row │ x y z
│ Int64 Symbol Int64
─────┼──────────────────────
1 │ 1 A 1
2 │ 2 A 3
3 │ 3 A 5
4 │ 1 B 7
5 │ 2 B 9
and create a new one by 1) grouping by the values in column :y
, 2) joining on the repeated values in :x
and 3) creating multiple :z
columns with new names pulled from :y
groups. To illustrate, the output would look like
3×3 DataFrame
Row │ x A B
│ Int64 Int64 Int64?
─────┼───────────────────────
1 │ 1 1 7
2 │ 2 3 9
3 │ 3 5 missing
I’ve tried a few different approaches with groupby
and outerjoin
but haven’t found the right one yet. This is the closest I’ve gotten
outerjoin(groupby(df, :y)..., on=:x, makeunique=true)
3×5 DataFrame
Row │ x y z y_1 z_1
│ Int64 Symbol? Int64? Symbol? Int64?
─────┼──────────────────────────────────────────
1 │ 1 A 1 B 7
2 │ 2 A 3 B 9
3 │ 3 A 5 missing missing