Groupby, aggregate with unstack on multiple columns

pdeffebach · October 11, 2020, 3:47pm

I do have a solution! I think it’s easiest if you unstack at the very beginning. One of the benefits of DataFrames is that you don’t always have to work in the “tidy data” paradigm, because it’s very easy to work with many columns.

julia> data_wide = unstack(data, "Metric", "Performance");

julia> combine(groupby(data_wide, [:Subject]), [:Action, :Client, :Product] .=> diff)
10×4 DataFrame
│ Row │ Subject │ Action_diff │ Client_diff │ Product_diff │
│     │ Int64   │ Int64       │ Int64       │ Int64        │
├─────┼─────────┼─────────────┼─────────────┼──────────────┤
│ 1   │ 1       │ -17         │ -18         │ -5           │
│ 2   │ 2       │ -12         │ 1           │ 6            │
│ 3   │ 3       │ -8          │ -12         │ -4           │
│ 4   │ 4       │ -15         │ -14         │ -6           │
│ 5   │ 5       │ -8          │ -1          │ 1            │
│ 6   │ 6       │ -17         │ -6          │ 0            │
│ 7   │ 7       │ -8          │ -18         │ -7           │
│ 8   │ 8       │ -4          │ -7          │ -4           │
│ 9   │ 9       │ -8          │ -8          │ 5            │
│ 10  │ 10      │ -10         │ -13         │ -6           │
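
For anyone who wants to run this without the original `data`, here is a minimal self-contained sketch of the same two steps. The table below is made up for illustration: the `Within` column name and all of the values are assumptions, not the data from the question.

```julia
using DataFrames

# Hypothetical long-format data: 2 subjects, 2 measurements ("Within") each,
# and 3 metrics per measurement. All values are invented for illustration.
data = DataFrame(
    Subject     = repeat([1, 2], inner = 6),
    Within      = repeat([1, 2], inner = 3, outer = 2),
    Metric      = repeat(["Action", "Client", "Product"], outer = 4),
    Performance = [10, 20, 30, 7, 25, 28, 5, 5, 5, 9, 3, 5],
)

# Step 1: go wide first. Each distinct value of Metric becomes its own column;
# the remaining columns (Subject, Within) identify the rows.
data_wide = unstack(data, :Metric, :Performance)

# Step 2: apply diff to each metric column within each subject.
# Broadcasting `.=>` builds one `column => diff` pair per metric.
result = combine(groupby(data_wide, :Subject),
                 [:Action, :Client, :Product] .=> diff)
```

With two rows per subject, `diff` returns a one-element vector per group, so `result` has one row per subject and columns named `Action_diff`, `Client_diff` and `Product_diff`.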

I’m not 100% sure what you want in the length(within) == 1 case, but hopefully the technique is similar.
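
As one possible reading of that case, here is a sketch that returns `missing` for a subject with a single row instead of a difference. The helper `diff_or_missing` and the small table are hypothetical, and the helper assumes every other subject has exactly two rows (`only` throws otherwise), so adjust it to whatever you actually want there.

```julia
using DataFrames

# Hypothetical wide data: subject 1 has two rows, subject 2 has only one.
wide = DataFrame(Subject = [1, 1, 2],
                 Action  = [10, 7, 5],
                 Client  = [20, 25, 5])

# Hypothetical helper: `missing` when there is nothing to difference,
# otherwise the single difference between the two rows.
diff_or_missing(x) = length(x) == 1 ? missing : only(diff(x))

result = combine(groupby(wide, :Subject),
                 [:Action, :Client] .=> diff_or_missing)
```

Note that plain `diff` on a one-element vector returns an empty vector, so with the original call a single-row subject would simply not appear in the output.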
