Hi,
I’m trying to count how often each unique value occurs in a column, per group, in a DataFrame.
As an example, given the following:
julia> using DataFrames
julia> df = DataFrame(lab = [repeat(["Lab1"], 4)...; repeat(["Lab2"], 5)...], value = ['a','a','b','a','a','b','c','c','c'])
9×2 DataFrame
│ Row │ lab    │ value │
│     │ String │ Char  │
├─────┼────────┼───────┤
│ 1   │ Lab1   │ 'a'   │
│ 2   │ Lab1   │ 'a'   │
│ 3   │ Lab1   │ 'b'   │
│ 4   │ Lab1   │ 'a'   │
│ 5   │ Lab2   │ 'a'   │
│ 6   │ Lab2   │ 'b'   │
│ 7   │ Lab2   │ 'c'   │
│ 8   │ Lab2   │ 'c'   │
│ 9   │ Lab2   │ 'c'   │
I would like to get
5×3 DataFrame
│ Row │ lab    │ value │ count │
│     │ String │ Char  │ Int64 │
├─────┼────────┼───────┼───────┤
│ 1   │ Lab1   │ 'a'   │ 3     │
│ 2   │ Lab1   │ 'b'   │ 1     │
│ 3   │ Lab2   │ 'a'   │ 1     │
│ 4   │ Lab2   │ 'b'   │ 1     │
│ 5   │ Lab2   │ 'c'   │ 3     │
I’ve gotten as far as
julia> using DataStructures  # provides counter
julia> grouped = groupby(df, :lab);
julia> combine(grouped, :value => (vals -> keys(counter(vals))) => :value, :value => (vals -> values(counter(vals))) => :count)
2×3 DataFrame
│ Row │ lab    │ value           │ count     │
│     │ String │ Base.KeySet…    │ Base.Val… │
├─────┼────────┼─────────────────┼───────────┤
│ 1   │ Lab1   │ ['a', 'b']      │ [3, 1]    │
│ 2   │ Lab2   │ ['a', 'c', 'b'] │ [1, 3, 1] │
However, I would prefer
- not to call `counter` twice
- actually split the counts into separate rows
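For reference, here is a rough sketch of two directions that might cover both points. It assumes that `counter` comes from DataStructures.jl and that the installed DataFrames version accepts `nrow => :count` and a function returning a DataFrame in `combine`; the names `counts` and `counts2` are only illustrative, and the row order of the results is not guaranteed.

```julia
using DataFrames
using DataStructures  # assumed source of `counter`

df = DataFrame(lab = [fill("Lab1", 4); fill("Lab2", 5)],
               value = ['a','a','b','a','a','b','c','c','c'])

# Option 1: group on both columns, so each (lab, value) pair is one group,
# then count the rows in each group. No counter is needed here.
counts = combine(groupby(df, [:lab, :value]), nrow => :count)

# Option 2: keep grouping by lab only, call counter once per group, and
# return a small DataFrame so each value/count pair becomes its own row.
counts2 = combine(groupby(df, :lab)) do sdf
    c = counter(sdf.value)
    DataFrame(value = collect(keys(c)), count = collect(values(c)))
end
```

Option 1 avoids the extra dependency entirely, while Option 2 stays closer to the attempt above and may be the easier one to adapt if the per-group function needs to do more than count.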
 
Can someone help?
Thanks!
Kevin