Hi all,
I have a large DataFrame with a column of values and a column of unique ids. I would like to create a new DataFrame with one row per unique value, holding that value in one column and a vector of the corresponding ids in another. Here is an example:
DataFrame
3×2 DataFrame
 Row │ id     value
     │ Int64  Int64
─────┼──────────────
   1 │     2      3
   2 │     4      3
   3 │     5      1
New DataFrame
 Row │ id      value
     │ Array…  Int64
─────┼───────────────
   1 │ [2, 4]      3
   2 │ [5]         1
One approach might be to group the DataFrame by value and extract the ids for each group using combine. However, I am not quite sure how to do that. Any guidance would be appreciated.
MWE
using DataFrames
df = DataFrame(id = [2,4,5], value = [3,3,1])
groups = groupby(df, :value)
# how do I extract the ids?
df_new = combine(groups, )
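
For reference, here is a minimal sketch of how the missing combine step might be filled in. It is not a definitive answer. It relies on the DataFrames.jl convention that a value wrapped in Ref is treated as a single cell, so each group's ids stay together as one vector. The anonymous function and the name ids are only illustrative.

```julia
using DataFrames

df = DataFrame(id = [2, 4, 5], value = [3, 3, 1])
groups = groupby(df, :value)

# Assumption: wrapping the per-group ids in Ref makes combine store the whole
# vector in one cell instead of spreading it over several rows.
# collect copies the group's view of :id into a plain Vector.
df_new = combine(groups, :id => (ids -> Ref(collect(ids))) => :id)

# combine places the grouping column first; reorder to match the desired layout.
df_new = select(df_new, :id, :value)
```

The rows of the result follow the group order. That order may not be guaranteed for integer keys unless the sort keyword of groupby is set, so it is worth checking against the installed DataFrames.jl version.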