Groupby, aggregate with unstack on multiple columns

I do have a solution! I think it’s easiest if you unstack at the very beginning. One of the benefits of DataFrames is that you don’t always have to work in the “tidy data” paradigm, because it’s very easy to work with many columns at once.

julia> data_wide = unstack(data, "Metric", "Performance");

julia> combine(groupby(data_wide, [:Subject]), [:Action, :Client, :Product] .=> diff)
10×4 DataFrame
│ Row │ Subject │ Action_diff │ Client_diff │ Product_diff │
│     │ Int64   │ Int64       │ Int64       │ Int64        │
├─────┼─────────┼─────────────┼─────────────┼──────────────┤
│ 1   │ 1       │ -17         │ -18         │ -5           │
│ 2   │ 2       │ -12         │ 1           │ 6            │
│ 3   │ 3       │ -8          │ -12         │ -4           │
│ 4   │ 4       │ -15         │ -14         │ -6           │
│ 5   │ 5       │ -8          │ -1          │ 1            │
│ 6   │ 6       │ -17         │ -6          │ 0            │
│ 7   │ 7       │ -8          │ -18         │ -7           │
│ 8   │ 8       │ -4          │ -7          │ -4           │
│ 9   │ 9       │ -8          │ -8          │ 5            │
│ 10  │ 10      │ -10         │ -13         │ -6           │
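
In case it helps to run this end to end, here is a minimal self-contained sketch of the same unstack-then-diff pattern. The toy data is made up, and the `Session` column is only my guess at what the within-subject variable looks like; the other column names follow the ones used here.

```julia
using DataFrames

# Hypothetical long-format data: 2 subjects x 2 sessions x 3 metrics.
# `Session` is an assumed name for the within-subject variable.
data = DataFrame(
    Subject = repeat([1, 2], inner = 6),
    Session = repeat([1, 2], inner = 3, outer = 2),
    Metric = repeat(["Action", "Client", "Product"], outer = 4),
    Performance = [10, 20, 30, 13, 18, 35, 5, 6, 7, 9, 6, 4],
)

# One column per metric; the remaining columns (Subject, Session) identify rows.
data_wide = unstack(data, :Metric, :Performance)

# Within each subject, take the session-to-session difference of every metric.
result = combine(groupby(data_wide, :Subject), [:Action, :Client, :Product] .=> diff)
```

Broadcasting `.=> diff` over the column list is what produces one `*_diff` column per metric, so adding another metric only means adding its name to the vector.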

I’m not 100% sure what you want in the length(within) == 1 case, but hopefully the technique is similar.
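
One possible way to handle that case, purely as a sketch: as far as I can tell, plain `diff` returns an empty vector for a one-row group, so that subject would silently drop out of the result. If you would rather keep it with a `missing`, a small helper can do that. `delta_or_missing` is a hypothetical name I made up, and the data below is invented.

```julia
using DataFrames

# Hypothetical wide data: subject 3 has only a single row.
wide = DataFrame(
    Subject = [1, 1, 2, 2, 3],
    Action = [10, 13, 5, 9, 7],
    Client = [20, 18, 6, 6, 8],
)

# Hypothetical helper: difference of exactly two values, otherwise missing.
delta_or_missing(v) = length(v) == 2 ? v[2] - v[1] : missing

# Each call returns a scalar, so every subject gets exactly one output row.
result = combine(groupby(wide, :Subject), [:Action, :Client] .=> delta_or_missing)
```

Whether `missing` is the right answer for a single observation depends on what you want downstream; returning `0` or dropping the subject are equally easy to swap in.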
