DataFrame: creating a rowmean for 95 columns

Hello,

I am quite new to Julia and I was wondering how to compute a row mean on a DataFrame. For Arrays this is quite easy, but I cannot figure out how to do it with a DataFrame because most transformations work on columns rather than rows. I tried to transpose the data using the permutedims function, but I got an error:


test = SP500Return[week2:end,groupedbeta[1].name]

permutedims(test, 1)

ArgumentError: src_namescol must have eltype `Symbol` or `<:AbstractString`

I have 95 columns which I have grouped into 10 portfolios based on the betas of the stocks (each column is a stock in my portfolio), so renaming them becomes quite annoying. Does anybody have an idea how to solve this issue?

I also tried:

combine(SP500Return[week2:end,groupedbeta[1].name], groupedbeta[1].name .=> ByRow(mean)) 

But then it just returns the DataFrame back.

Thank you in advance

Sorry, I’m a bit confused as to what you are asking.

Do you mean something like Stata’s rowmean?

egen x = rowmean(`vars')

There is no transpose for DataFrames.

I think you want AsTable

julia> using DataFrames, Statistics
julia> df = DataFrame(rand(1000, 100), :auto);
julia> transform(df, AsTable(Between(:x50, :x100)) => ByRow(mean) => :mean_50_100)
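
If the columns are not contiguous, AsTable should also accept a plain vector of names, which seems closer to the original question. A minimal sketch, assuming a toy DataFrame with made-up column names; `cols` is only a hypothetical stand-in for something like groupedbeta[1].name:

```julia
using DataFrames, Statistics

# Toy data standing in for the returns table; the column names are made up.
df = DataFrame(A = [1.0, 2.0], B = [3.0, 4.0], C = [5.0, 9.0])

# Hypothetical stand-in for a vector of names such as groupedbeta[1].name
cols = ["A", "C"]

# AsTable(cols) hands each row to the function as a NamedTuple
# holding only the selected columns, so mean sees one collection per row.
out = transform(df, AsTable(cols) => ByRow(mean) => :rowmean)
```

Column B is left out of the mean here, and `out` keeps all original columns plus the new `rowmean` column.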

Yeah, the standard way to do this is

transform!(df,["col1","col2","col3"] => ByRow(mean) => "meancol")

where you just need to update the strings to match your list of columns and the name of your new column. The AsTable in the above answer is a handy shortcut instead of listing the column names one by one.

This will fail, because without AsTable each column value is passed to mean as a separate positional argument:

julia> mean(1, 2, 3)
ERROR: MethodError: no method matching mean(::Int64, ::Int64, ::Int64)

You need the AsTable so that the input is a NamedTuple.
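
To see the difference outside of DataFrames, here is a small sketch using only Statistics. The NamedTuple `row` is just an illustration of what a single row looks like under AsTable; the field names are made up.

```julia
using Statistics

# Under AsTable, each row arrives as one NamedTuple; mean iterates over its values.
row = (x1 = 1, x2 = 2, x3 = 3)
m_row = mean(row)

# Without AsTable, the call is effectively mean(1, 2, 3), which has no matching method.
failed = try
    mean(1, 2, 3)
    false
catch err
    err isa MethodError
end
```

So `m_row` is computed over a single collection, while the three-argument call ends in the MethodError shown above.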

You’re right! I guess that’s just what I want to work. I think I made that mistake just a few days ago too…

EDIT: It is just a failing of the mean method though, not of the approach here.

Thank you for your responses!

For posterity I will add the fix to my earlier solution. Annoying that this is needed, but you can make it a tuple before passing to mean.

transform!(df,["col1","col2","col3"] => ByRow(mean) ∘ ByRow(tuple) => "meancol")
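
Two variants that may also be worth trying, sketched on a toy DataFrame with made-up values: composing inside a single ByRow, and going through a Matrix as one would for plain Arrays. The Matrix route assumes all selected columns share a numeric type.

```julia
using DataFrames, Statistics

# Toy data; the column names mirror the placeholders used above.
df = DataFrame(col1 = [1.0, 2.0], col2 = [3.0, 4.0], col3 = [5.0, 6.0])
cols = ["col1", "col2", "col3"]

# Variant of the fix above: pack the positional arguments into a tuple,
# then take the mean, all inside one ByRow.
transform!(df, cols => ByRow(mean ∘ tuple) => "meancol")

# Array route: copy the selected columns into a Matrix and average along dims = 2.
matmean = vec(mean(Matrix(df[:, cols]), dims = 2))
```

Both should give the same numbers; the Matrix version allocates a copy of the selected columns, which may matter for very large tables.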