After removing some rows from a dataframe, I would like to keep only the groups (grouping on a variable) that have “complete” observations. In the dataset I start with, every household has 2 members, so any household with fewer can be considered incomplete.
The following MWE works, but is there a more idiomatic way?
Not sure if it’s idiomatic, but sometimes I think it’s helpful to add the count as an extra column. You can do so by calling transform! on the grouped data. The parent, ungrouped dataframe is updated in place.
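A minimal sketch of this approach, using made-up data. The column names household, member and n_members are hypothetical, chosen only for illustration:

```julia
using DataFrames

# Hypothetical data: household 2 lost a member when rows were removed.
df = DataFrame(household = [1, 1, 2, 3, 3], member = [1, 2, 1, 1, 2])

# Add the group size as a new column; this mutates the parent `df`.
transform!(groupby(df, :household), nrow => :n_members)

# Keep only households that still have both members,
# then drop the helper column again.
complete = select(subset(df, :n_members => ByRow(==(2))), Not(:n_members))
```

Note that df keeps the n_members column after this, which is the “computing in the table” side effect of the in-place version.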
Thanks for all the answers. “Computing in the table”, which is commonly used in e.g. Stata, is a style I would particularly like to avoid, because I find that it is frequently a source of bugs. I am very happy that DataFrames supports a functional style, and I was just wondering if I am doing the right thing.
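To make the contrast concrete, here is a rough sketch of what a functional-style version could look like. It is only an illustration with made-up data and hypothetical column names (household, member), and it assumes a DataFrames version that supports filter on a GroupedDataFrame:

```julia
using DataFrames

# Hypothetical data: household 2 lost a member when rows were removed.
df = DataFrame(household = [1, 1, 2, 3, 3], member = [1, 2, 1, 1, 2])

# Filter whole groups by size; no helper column is added and `df` is not modified.
complete_groups = filter(sdf -> nrow(sdf) == 2, groupby(df, :household))

# Collect the remaining groups back into an ordinary data frame.
complete = DataFrame(complete_groups)
```

Here the group size is only ever computed inside the predicate, so there is no intermediate column that could go stale.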