Tag each unique combination of column values in DataFrames

Great question!

I’m not sure I have a great solution, and it looks like others have requested the same thing, based on this GitHub issue.

The solution proposed by Bogumil is groupindices:

julia> using DataFrames

julia> df = DataFrame(a = rand(1:5, 100), b = rand(11:15, 100));

julia> gd = groupby(df, [:a, :b]);

julia> groupindices(gd)
100-element Vector{Union{Missing, Int64}}:
  7
 17
 17
 22
 17
  7
  7
  6
 11
 11
  ⋮
 12
 10
 14
 19
  1
  9
 20
 12
 18
  7

But as you can see in that issue, there is still some discussion about adding a convenience function.
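
If the goal is to keep the tag next to the data, one option is to store the result of groupindices as a new column. This is only a minimal sketch: the column name group_id and the small fixed DataFrame are made up for illustration, and the exact integers assigned to each group may depend on how groupby orders the groups.

```julia
using DataFrames

# Small deterministic example (hypothetical data, not from the question)
df = DataFrame(a = [1, 1, 2, 2, 1], b = [11, 12, 11, 11, 11])

# Group on the columns whose unique combinations should be tagged
gd = groupby(df, [:a, :b])

# groupindices returns one group number per row of the parent DataFrame,
# so it can be assigned directly as a column (group_id is an assumed name)
df.group_id = groupindices(gd)
```

Rows 1 and 5 share the combination (1, 11) and rows 3 and 4 share (2, 11), so each pair gets the same tag, while row 2 gets its own.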
