How can I modify the contents of a GroupedDataFrame?

julia> using DataFrames

julia> group = [:A, :A, :B, :B]; X = 1:4; Y = 5:8;

julia> df = DataFrame(; group, X, Y)
4×3 DataFrame
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ A           1      5
   2 │ A           2      6
   3 │ B           3      7
   4 │ B           4      8
julia> gdf = groupby(df, :group)
GroupedDataFrame with 2 groups based on key: group
First Group (2 rows): group = :A
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ A           1      5
   2 │ A           2      6
⋮
Last Group (2 rows): group = :B
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ B           3      7
   2 │ B           4      8

How can I change the contents of gdf directly? I want to remove the first row of group A (gdf's first group) so that the GroupedDataFrame itself is updated. The result should look like this:

julia> gdf
GroupedDataFrame with 2 groups based on key: group
First Group (1 row): group = :A
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ A           2      6
⋮
Last Group (2 rows): group = :B
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ B           3      7
   2 │ B           4      8

Thanks for helping me!

It looks like you’ll have to delete the row in the original DataFrame and then recreate the GroupedDataFrame.

julia> deleteat!(df, findfirst(==(:A), df.group))
3×3 DataFrame
 Row │ group   X      Y     
     │ Symbol  Int64  Int64 
─────┼──────────────────────
   1 │ A           2      6
   2 │ B           3      7
   3 │ B           4      8

julia> gdf = groupby(df, :group)
GroupedDataFrame with 2 groups based on key: group

Each group in a GroupedDataFrame is a SubDataFrame, which is a view into the parent DataFrame, and rows can’t be deleted from it directly.

julia> deleteat!(gdf[1], 1)
ERROR: ArgumentError: SubDataFrame does not support deleting rows

Thanks very much

You could subset the GroupedDataFrame, but then you’ll have a DataFrame:

julia> subset(gdf, [:group, :X] => ((g, x) -> (g .!= :A) .|| (x .> 1)))
3×3 DataFrame
 Row │ group   X      Y
     │ Symbol  Int64  Int64
─────┼──────────────────────
   1 │ A           2      6
   2 │ B           3      7
   3 │ B           4      8
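
If you need the result to stay grouped, subset also accepts the ungroup keyword for a GroupedDataFrame, as far as I know. A minimal sketch that rebuilds the example data from the question (the name gdf2 is only for illustration, and .|| needs Julia 1.7 or later):

```julia
using DataFrames

df = DataFrame(group = [:A, :A, :B, :B], X = 1:4, Y = 5:8)
gdf = groupby(df, :group)

# Same predicate as above: keep every row outside group A, and in group A
# only the rows with X > 1. `ungroup = false` asks for a GroupedDataFrame back.
gdf2 = subset(gdf, [:group, :X] => ((g, x) -> (g .!= :A) .|| (x .> 1)); ungroup = false)

# The result can still be iterated group by group.
for (key, sdf) in pairs(gdf2)
    println(key.group, " has ", nrow(sdf), " row(s)")
end
```

Note that gdf2 is a new object built from a filtered copy of the data; the original df and gdf are left unchanged.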

Sorry, I want to iterate over each SubDataFrame afterwards, and that might be difficult if I do what you suggested.

Something like this?

julia> combine(gdf) do sdf
           sdf[1, :]
       end
2×3 DataFrame
 Row │ group   X      Y     
     │ Symbol  Int64  Int64 
─────┼──────────────────────
   1 │ A           1      5
   2 │ B           3      7

EDIT: You can actually pass ungroup = false to preserve the grouping:

julia> combine(gdf; ungroup = false) do sdf
           sdf[1, :]
       end
GroupedDataFrame with 2 groups based on key: group
First Group (1 row): group = :A
 Row │ group   X      Y     
     │ Symbol  Int64  Int64 
─────┼──────────────────────
   1 │ A           1      5
⋮
Last Group (1 row): group = :B
 Row │ group   X      Y     
     │ Symbol  Int64  Int64 
─────┼──────────────────────
   1 │ B           3      7

Thanks. If I want to select A’s first row and B’s second row, and get back a GroupedDataFrame, what should I do?

Hmmmm… probably an if-else statement inside the function.
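
A rough sketch of that idea, assuming "subset" here means keeping A's first row and B's second row. The data is rebuilt from the question, and the name picked is only for illustration:

```julia
using DataFrames

df = DataFrame(group = [:A, :A, :B, :B], X = 1:4, Y = 5:8)
gdf = groupby(df, :group)

# Branch on the group key inside the function: row 1 for group A,
# row 2 for any other group (only B in this example).
picked = combine(gdf; ungroup = false) do sdf
    if first(sdf.group) == :A
        sdf[1, :]
    else
        sdf[2, :]
    end
end
```

With more groups, a Dict mapping each key to a row index might be easier to maintain than a growing if-else chain.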