# Row-wise mean of columns in a DataFrame

I saw a post where @bkamins explained how to add a column holding the row-wise sum of selected columns ("(Julia) Assigning DataFrame column sum to a new column" on Stack Overflow). Is there a way to use `transform` to do the same, but with the `mean` operation?

I know I can do something like:

``````
using Statistics
df.mean = vec(mean(Array(df), dims=2))  # vec: a column must be a vector, not an n×1 matrix
``````

Another way with `transform` is:

``````
transform(df, names(df) => ByRow((i...) -> mean(i)))
``````

Is there a cleaner way to do this with `transform`?

Definitely don’t do this. It will allocate tons of memory.

You want

``````
transform(df, AsTable(:) => ByRow(mean) => :rowmean)
``````
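For anyone landing here later, a minimal sketch of the `AsTable` form on a toy table. The data and column names (`a`, `b`, `c`, `mean_ab`) are made up for illustration; the second call shows how to restrict the mean to selected columns, as in the original question.

```julia
using DataFrames, Statistics

# Hypothetical toy data; the column names are invented for this example.
df = DataFrame(a = [1.0, 2.0, 3.0], b = [3.0, 4.0, 5.0], c = [5.0, 6.0, 10.0])

# Row-wise mean over every column; each row is passed to `mean` as a named tuple.
all_cols = transform(df, AsTable(:) => ByRow(mean) => :rowmean)

# Row-wise mean over a subset of columns only.
some_cols = transform(df, AsTable([:a, :b]) => ByRow(mean) => :mean_ab)
```

Both calls return a new data frame and leave `df` untouched; `transform!` would modify it in place instead.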

BTW, are you coming from Stata by chance? This is a common operation in Stata.

Also, the above will run into trouble if you have very many columns, because it constructs a named tuple. Making it better has been the subject of extensive discussion. See here.

Awesome, thanks!

Not from Stata, just trying to plot the time series output of multiple stochastic simulations (mean ± 1 std).
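A rough sketch of that use case, assuming one row per time step and one column per simulation run. The table `sims`, its column names, and the output names `lower`/`upper` are hypothetical, and the matrix route is just one option.

```julia
using DataFrames, Statistics

# Hypothetical layout: rows are time steps, columns are simulation runs.
sims = DataFrame(run1 = [1.0, 2.0, 3.0],
                 run2 = [3.0, 2.0, 5.0],
                 run3 = [5.0, 2.0, 7.0])

runs = Matrix(sims)              # copies the data into a (time × runs) matrix
m = vec(mean(runs, dims = 2))    # mean across runs at each time step
s = vec(std(runs, dims = 2))     # sample standard deviation across runs

# Band of mean ± 1 std, ready to hand to a plotting package.
band = DataFrame(mean = m, lower = m .- s, upper = m .+ s)
```

`vec` turns the n×1 results into plain vectors so they can be stored as columns.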

In this case, broadcasting seems to be faster and to allocate less than using `transform`?

``````
using DataFrames, BenchmarkTools, Statistics
df = DataFrame(rand(1000,1000), :auto)
dg = deepcopy(df)
@btime transform($df, AsTable(:) => ByRow(mean) => :rowmean)  # 322.4 ms (1015669 allocations: 77.59 MiB)
@btime dg.rowmean .= mean(Array($dg), dims=2)    # 1.887 ms (1020 allocations: 7.67 MiB)
``````

Yeah, `1000` is definitely large enough for `AsTable` to cause problems.

``````
reduce(+, eachcol(df)) ./ ncol(df)
``````

should be faster than both.
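A small sketch of how the `reduce` version could be used to add the column, on made-up data. It assumes every column is numeric and has no missing values, since `+` on the column vectors will not skip them.

```julia
using DataFrames, Statistics

# Hypothetical toy data; the column names are invented for this example.
df = DataFrame(a = [1.0, 2.0, 3.0], b = [3.0, 4.0, 5.0], c = [5.0, 6.0, 10.0])

# Add the column vectors element-wise, then divide by the number of columns.
rowmean = reduce(+, eachcol(df)) ./ ncol(df)

# Assign only after computing, so the new column is not part of its own input.
df.rowmean = rowmean
```

This avoids building a named tuple per row and only allocates a few vectors of length `nrow(df)`.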
