Frequency Weights in DataFrames

Hello,

I need to compute a group mean using frequency weights and I do not know how to do it.

I know how to compute an unweighted mean:

using DataFrames, Statistics

df = DataFrame(age = rand(25:30, 25), value = rand(25), fweight = rand(100_000:1_000_000, 25))

unweighted = combine(groupby(df, :age), :value => mean)

How can I do the same, but have the mean use the frequency weights in :fweight?

Thanks!

Maybe something like this?

combine(groupby(df, :age), [:value, :fweight] => ((x, y) -> sum(x .* y)/sum(y)) => :weight_mean)

You can also use the weighted mean defined in StatsBase (this needs using StatsBase):

combine(groupby(df, :age), [:value, :fweight] => ((x, y) -> mean(x, fweights(y))) => :weighted_mean)
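To check that the two suggestions above agree, here is a minimal sketch on a small hand-made table. The values and the names manual and viastats are made up for illustration and are not from the original posts; the sketch assumes DataFrames and StatsBase are installed.

```julia
using DataFrames, Statistics, StatsBase

# Illustrative data (assumed values, chosen so the result is easy to check by hand)
df = DataFrame(age     = [25, 25, 26, 26],
               value   = [1.0, 3.0, 2.0, 4.0],
               fweight = [1, 3, 2, 2])

# Manual weighted mean per group: sum(value * weight) / sum(weight)
manual = combine(groupby(df, :age),
    [:value, :fweight] => ((x, y) -> sum(x .* y) / sum(y)) => :weighted_mean)

# Same computation using StatsBase's frequency weights
viastats = combine(groupby(df, :age),
    [:value, :fweight] => ((x, y) -> mean(x, fweights(y))) => :weighted_mean)

# Both approaches should give the same group means
println(manual.weighted_mean ≈ viastats.weighted_mean)
```

For age 25 the weighted mean is (1*1 + 3*3)/4 = 2.5, and for age 26 it is (2*2 + 4*2)/4 = 3.0, so both columns should hold those two values.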

If you use these weights repeatedly, you can also do df.fweight = fweights(df.fweight) once and then write y instead of fweights(y).

See https://juliastats.org/StatsBase.jl/latest/means/#Statistics.mean for details.
