I’m looking for a Julian way of selecting a subset of each group. I have a DataFrame with (among others) two columns, say name and length. I want to group by name and pick the 5 tallest people within each name. I tried this, but it does not return the correct result:
df = ...
sort!(df, [:length])
df2 = df |> @groupby([:name]) |> @take(5) |> collect
print(DataFrame(df2))
Changing collect to DataFrame does not work either. The printed result is a DataFrame with as many rows as the initial DataFrame df. This kind of operation (taking a df → grouping it → selecting a subset of the rows of each group → recombining the selected rows into a DataFrame) is something I would assume is common.
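To make the intended operation concrete, here is a minimal sketch of the group → take top n → recombine pipeline using plain DataFrames.jl instead of Query.jl. The sample data and the variable `n` are made up for illustration (the real df is not shown above), and this is only one possible way to do it. Note that the sort has to be descending to get the tallest people first, whereas `sort!(df, [:length])` sorts ascending.

```julia
using DataFrames

# Hypothetical sample data; the real df comes from elsewhere.
df = DataFrame(name = ["Ann", "Ann", "Ann", "Bob", "Bob"],
               length = [170, 182, 165, 190, 175])

n = 2  # rows to keep per group (this would be 5 in the question)

# Sort tallest first, so the first n rows of each group are the n tallest.
sorted = sort(df, :length, rev = true)

# Group by name, keep the first n rows of every group,
# and let combine stack the pieces back into one DataFrame.
df2 = combine(groupby(sorted, :name), sdf -> first(sdf, n))

print(df2)
```

With this sample data, df2 has at most n rows per name, so 4 rows in total rather than the 5 rows of df.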