# Group, Mutate, Ungroup

Is there a supported way to do the following workflow? I realize this example is trivial, but the same pattern could be applied to more complex workflows such as subgroup moving averages. And while there are ways to run calculations over a DataFrame that do not need this method, sometimes this sort of brute-force approach is the easiest one (both to reason about and to write quickly).

Goal: Number all rows in each group from 1:Number of Rows
Operation 1: Break DataFrame into SubGroups based on Attributes
Operation 2: Add column to each subgroup which numbers it from 1:Number of Rows
Operation 3: Ungroup SubGroups back into single DataFrame

I know operation 1 is possible in Julia today. Is there a way to perform operations 2 and 3? I have tried to modify a SubDataFrame and it does not seem possible. Also, I can't find a way to ungroup a grouped DataFrame.

```
julia> df = DataFrame(x=[1, 1, 2, 2, 2, 3])
6×1 DataFrames.DataFrame
│ Row │ x │
├─────┼───┤
│ 1   │ 1 │
│ 2   │ 1 │
│ 3   │ 2 │
│ 4   │ 2 │
│ 5   │ 2 │
│ 6   │ 3 │

julia> by(df, :x) do sdf
           DataFrame(n=1:size(sdf,1))
       end
6×2 DataFrames.DataFrame
│ Row │ x │ n │
├─────┼───┼───┤
│ 1   │ 1 │ 1 │
│ 2   │ 1 │ 2 │
│ 3   │ 2 │ 1 │
│ 4   │ 2 │ 2 │
│ 5   │ 2 │ 3 │
│ 6   │ 3 │ 1 │
```

The `by` function is just a combination of `groupby` (operation 1) and `combine` (operation 3); the function you pass in the `do` block is operation 2, applied to each subgroup.
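To make the three operations explicit, here is a minimal sketch that spells out the grouping and combining steps separately. It assumes a recent DataFrames.jl release (1.x), where `by` is no longer available and `combine` accepts a function applied to each subgroup; the variable names `gdf`, `numbered`, and `with_n` are only for illustration.

```julia
using DataFrames

df = DataFrame(x = [1, 1, 2, 2, 2, 3])

# Operation 1: split the rows into groups by the value of :x
gdf = groupby(df, :x)

# Operations 2 and 3: build the numbering column for each subgroup,
# then stack the per-group results back into a single DataFrame.
# The grouping column :x is kept in the result automatically.
numbered = combine(gdf) do sdf
    DataFrame(n = 1:nrow(sdf))
end

# Alternative: transform keeps all original columns and the original
# row order, and adds the new column :n computed within each group.
with_n = transform(gdf, :x => (v -> 1:length(v)) => :n)
```

The `combine` form mirrors the `by ... do` call above most closely. The `transform` form is probably the more convenient one when the DataFrame has other columns you want to keep alongside the new one.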

Sweet! I didn't realize `by ... do` could be used like that. I think because all of the examples did summary statistics, I never thought to try it for non-summary methods.
