Hello,
I need to compute a group mean using frequency weights and I do not know how to do it.
I know how to compute an unweighted mean:
using DataFrames, Statistics
df = DataFrame(age = rand(25:30, 25), value = rand(25), fweight = rand(100_000:1_000_000, 25))
unweighted = combine(groupby(df, :age), :value => mean)
How can I do the same, but have the mean use the frequency weights in :fweight?
Thanks!
Maybe something like this?
combine(groupby(df, :age), [:value, :fweight] => ((x, y) -> sum(x .* y)/sum(y)) => :weight_mean)
You can also use the weighted mean defined in StatsBase (this needs using StatsBase):
combine(groupby(df, :age), [:value, :fweight] => ((x, y) -> mean(x, fweights(y))) => :weighted_mean)
If you use these weights repeatedly, you can also do df.fweight = fweights(df.fweight) once and then write y instead of fweights(y).
See https://juliastats.org/StatsBase.jl/latest/means/#Statistics.mean for details.
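To sanity-check the two approaches above, here is a minimal sketch on a small made-up table where the weighted means can be worked out by hand. The name toy and its values are my own assumptions for illustration, not part of the original question.

```julia
using DataFrames, Statistics, StatsBase

# Toy data (hypothetical values): two age groups with integer frequency weights.
toy = DataFrame(age = [25, 25, 26, 26],
                value = [1.0, 3.0, 2.0, 4.0],
                fweight = [1, 3, 2, 2])

# Manual formula: sum(value * weight) / sum(weight) within each group.
manual = combine(groupby(toy, :age),
                 [:value, :fweight] => ((x, y) -> sum(x .* y) / sum(y)) => :weighted_mean)

# Same computation using StatsBase's frequency weights.
statsbase = combine(groupby(toy, :age),
                    [:value, :fweight] => ((x, y) -> mean(x, fweights(y))) => :weighted_mean)
```

By hand, age 25 gives (1*1 + 3*3)/4 = 2.5 and age 26 gives (2*2 + 4*2)/4 = 3.0, so both manual.weighted_mean and statsbase.weighted_mean should come out as those two values.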