# Normalized nrow in a GroupedDataFrame

What is the best way to compute a “normalized” `nrow` (each group's share of the total row count) in a `GroupedDataFrame`?

``````
using DataFrames
using Chain

df = DataFrame(id=1:6,
               name=["Aaron Aardvark", "Belen Barboza",
                     "春 陈", "Даниил Дубов",
                     "Elżbieta Elbląg", "Felipe Fittipaldi"],
               age=[50, 45, 40, 35, 30, 25],
               eye=["blue", "brown", "hazel", "blue", "green", "brown"],
               grade_1=[95, 90, 85, 90, 95, 90],
               grade_2=[75, 80, 65, 90, 75, 95],
               grade_3=[85, 85, 90, 85, 80, 85])

@chain df begin
    groupby(:eye)
    combine(nrow => :n,  x -> nrow(x) / nrow(df))
end
``````

That outputs:

``````
4×3 DataFrame
 Row │ eye     n      x1
     │ String  Int64  Float64
─────┼─────────────────────────
   1 │ blue        2  0.333333
   2 │ brown       2  0.333333
   3 │ hazel       1  0.166667
   4 │ green       1  0.166667
``````

But if I try to rename the `x1` column, I get something strange:

``````
julia> @chain df begin
           groupby(:eye)
           combine(nrow => :n,  x -> nrow(x) / nrow(df) => :perc)
       end
4×3 DataFrame
 Row │ eye     n      x1
     │ String  Int64  Pair…
─────┼────────────────────────────────
   1 │ blue        2  0.333333=>:perc
   2 │ brown       2  0.333333=>:perc
   3 │ hazel       1  0.166667=>:perc
   4 │ green       1  0.166667=>:perc
``````

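That’s operator precedence for you: `->` binds more loosely than `=>`, so the `=> :perc` ends up inside the function body and each group returns a `Pair`. This works: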

``````
julia> @chain df begin
           groupby(:eye)
           combine(nrow => :n,  :name => (x -> length(x) / nrow(df)) => :perc)
       end
4×3 DataFrame
 Row │ eye     n      perc
     │ String  Int64  Float64
─────┼─────────────────────────
   1 │ blue        2  0.333333
   2 │ brown       2  0.333333
   3 │ hazel       1  0.166667
   4 │ green       1  0.166667
``````

(Note the parentheses around the anonymous function.)
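To see the parsing difference in isolation, here is a minimal sketch in plain Julia with no DataFrames involved. The names `f` and `g` are made up for illustration only.

``````julia
# Without parentheses, everything to the right of -> is the function body,
# so the pair is built inside the function on every call.
f = x -> x / 2 => :perc

# With parentheses, the pair is built once: (function) => :perc.
g = (x -> x / 2) => :perc

println(f(1))        # the function returns a Pair
println(g isa Pair)  # g itself is a Pair of (function, Symbol)
``````

The second form is what `combine` needs on the right-hand side of `cols => function => newcol`.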


Thank you!

Why does the first case work? I don’t see anything in the docs for passing just a function like that, only `cols => function` or `cols => function => newcols`. And why is that column called `x1`?

``````
julia> @chain df begin
           groupby(:eye)
           combine(y -> 1)
       end
4×2 DataFrame
 Row │ eye     x1
     │ String  Int64
─────┼───────────────
   1 │ blue        1
   2 │ brown       1
   3 │ hazel       1
   4 │ green       1

julia> @chain df begin
           groupby(:eye)
           combine(y -> 1, x -> 2)
       end
ERROR: ArgumentError: duplicate output column name: :x1
``````

It’s list item 7 here. You can pass a function that accepts a `SubDataFrame`. But I guess the auto-generated names aren’t made unique, so you get an error when it tries to create `:x1` twice.
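For what it's worth, here is a rough sketch of a few ways to avoid the default `:x1` name. It uses a cut-down version of the data frame from the question, and the result names (`res_nt`, `res_two`, `res_pr`) are just for illustration. The last variant assumes DataFrames.jl 1.4 or later, where `proprow` is available; on older versions that line will not work.

``````julia
using DataFrames

# Reduced version of the data from the question (only the columns needed here).
df = DataFrame(id=1:6, eye=["blue", "brown", "hazel", "blue", "green", "brown"])
gdf = groupby(df, :eye)

# A bare function that returns a NamedTuple names its own output column,
# so nothing falls back to the default :x1.
res_nt = combine(gdf, nrow => :n, sdf -> (perc = nrow(sdf) / nrow(df),))

# The cols => function => name form gives each result its own name,
# so the duplicate-name error does not arise.
res_two = combine(gdf, :id => (y -> 1) => :a, :id => (x -> 2) => :b)

# Assumption: DataFrames.jl 1.4 or later, which provides `proprow`
# (the proportion of rows in each group).
res_pr = combine(gdf, nrow => :n, proprow => :perc)
``````

The NamedTuple variant still calls the function on a `SubDataFrame` per group, so with many groups the `cols => function => name` form or `proprow` is likely the better choice.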