What is the best way to compute a “normalized” nrow (each group's share of all rows) on a GroupedDataFrame?
using DataFrames
using Chain
df = DataFrame(id=1:6,
               name=["Aaron Aardvark", "Belen Barboza",
                     "春 陈", "Даниил Дубов",
                     "Elżbieta Elbląg", "Felipe Fittipaldi"],
               age=[50, 45, 40, 35, 30, 25],
               eye=["blue", "brown", "hazel", "blue", "green", "brown"],
               grade_1=[95, 90, 85, 90, 95, 90],
               grade_2=[75, 80, 65, 90, 75, 95],
               grade_3=[85, 85, 90, 85, 80, 85])
@chain df begin
    groupby(:eye)
    combine(nrow => :n, x -> nrow(x) / nrow(df))
end
That outputs:
4×3 DataFrame
 Row │ eye     n      x1
     │ String  Int64  Float64
─────┼─────────────────────────
   1 │ blue        2  0.333333
   2 │ brown       2  0.333333
   3 │ hazel       1  0.166667
   4 │ green       1  0.166667
But if I try to rename the x1 column, I get a strange result:
@chain df begin
    groupby(:eye)
    combine(nrow => :n, x -> nrow(x) / nrow(df) => :perc)
end
4×3 DataFrame
 Row │ eye     n      x1
     │ String  Int64  Pair…
─────┼────────────────────────────────
   1 │ blue        2  0.333333=>:perc
   2 │ brown       2  0.333333=>:perc
   3 │ hazel       1  0.166667=>:perc
   4 │ green       1  0.166667=>:perc
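My guess at what is going on: `->` binds very loosely, so `x -> nrow(x) / nrow(df) => :perc` is parsed as `x -> (nrow(x) / nrow(df) => :perc)`. The anonymous function then returns a `Pair` for each group, which matches the `Pair…` column above. A minimal sketch of two ways around it, assuming a cut-down `df` with only the `eye` column (the names `gdf`, `res` and `res2` are just for illustration):

```julia
using DataFrames

# Reduced version of the data above: only the grouping column is kept.
df = DataFrame(eye=["blue", "brown", "hazel", "blue", "green", "brown"])
gdf = groupby(df, :eye)

# Option 1: use the `source => function => target` form and wrap the
# anonymous function in parentheses so `=> :perc` is not part of its body.
# `x` is the `eye` column of one group, so `length(x)` is that group's row count.
res = combine(gdf, nrow => :n, :eye => (x -> length(x) / nrow(df)) => :perc)

# Option 2: count first, then normalize the counts on the combined table.
# Here `n` is the whole `n` column, so the division is broadcast.
res2 = transform(combine(gdf, nrow => :n), :n => (n -> n ./ sum(n)) => :perc)
```

Both should give a `perc` column of type `Float64` next to `n`. The second form avoids referring to the outer `df` inside the function, which may be handier in a longer `@chain` pipeline.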