Hello,
I am quite new to Julia and I was wondering how to compute a row mean on a DataFrame. For Arrays this is quite easy, but I cannot figure out how to do it with a DataFrame, because most transformations work on the columns instead of the rows. I tried to transpose the data using the permutedims function, but I got an error:
test = SP500Return[week2:end,groupedbeta[1].name]
permutedims(test, 1)
ArgumentError: src_namescol must have eltype `Symbol` or `<:AbstractString`
I have 95 columns, which I have grouped into 10 portfolios based on the betas of the stocks (each column is a stock in my portfolio), so renaming them becomes quite annoying. Does anybody have an idea how to solve this issue?
I also tried:
combine(SP500Return[week2:end,groupedbeta[1].name], groupedbeta[1].name .=> ByRow(mean))
But that just returns the original DataFrame.
Thank you in advance
Sorry, I’m a bit confused as to what you are asking.
Do you mean something like Stata’s rowmean?
egen x = rowmean(`vars')
There is no transpose for DataFrames.
I think you want AsTable
julia> df = DataFrame(rand(1000, 100), :auto);
julia> transform(df, AsTable(Between(:x50, :x100)) => ByRow(mean) => :mean_50_100)
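For the case in the question, where the columns of one portfolio are held in a vector of names rather than in a contiguous range, a minimal sketch might look like the following. The toy data and the `cols` vector are made up for illustration; `cols` stands in for something like `groupedbeta[1].name`.

```julia
using DataFrames, Statistics

# Toy data: each column stands in for one stock's returns (made-up numbers).
df = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0], c = [5.0, 9.0])

# Hypothetical list of the columns that belong to one portfolio.
cols = ["a", "b"]

# AsTable(cols) passes each row to `mean` as a NamedTuple of the selected columns.
out = transform(df, AsTable(cols) => ByRow(mean) => :portfolio_mean)
```

Here `out` keeps the original columns and gains a `portfolio_mean` column holding the mean of `a` and `b` for each row.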
Yeah, the standard way to do this is
transform!(df,["col1","col2","col3"] => ByRow(mean) => "meancol")
where you just need to update the strings to correspond to your list of columns and the name of your new column. The AsTable in the above answer is a handy shortcut instead of listing the column names one by one.
This will fail
julia> mean(1, 2, 3)
ERROR: MethodError: no method matching mean(::Int64, ::Int64, ::Int64)
You need the AsTable so that the input is a NamedTuple.
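To make the difference concrete, here is a small sketch using only Statistics. The NamedTuple `row` is a made-up stand-in for what ByRow receives for one row when the source is wrapped in AsTable.

```julia
using Statistics

# Stand-in for one row as passed by AsTable(...): a single iterable argument.
row = (x1 = 1, x2 = 2, x3 = 3)

m_namedtuple = mean(row)        # iterates over the values of the NamedTuple
m_tuple = mean((1, 2, 3))       # a plain tuple is also a single iterable

# mean(1, 2, 3)                 # three positional arguments: the MethodError above
```

Both calls average the same three values, so the two results agree.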
You’re right! I guess that’s just what I want to work. I think I made that mistake just a few days ago too…
EDIT: It is just a failing of the mean method though, not of the approach here.
Thank you for your responses!
For posterity, I will add the fix to my earlier solution. It is annoying that this is needed, but you can pack the row values into a tuple before passing them to mean.
transform!(df,["col1","col2","col3"] => ByRow(mean) ∘ ByRow(tuple) => "meancol")
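As a rough check, the tuple approach and the AsTable approach should give the same column. The sketch below uses made-up data and writes the composition inside a single ByRow, as `ByRow(mean ∘ tuple)`, which is a slight variant of the line above rather than the exact form posted.

```julia
using DataFrames, Statistics

# Made-up data with the column names used in the example above.
df = DataFrame(col1 = [1.0, 2.0], col2 = [3.0, 4.0], col3 = [5.0, 6.0])
cols = ["col1", "col2", "col3"]

# Pack each row's values into a tuple, then average the tuple.
a = transform(df, cols => ByRow(mean ∘ tuple) => "meancol")

# Same computation, with each row passed as a NamedTuple via AsTable.
b = transform(df, AsTable(cols) => ByRow(mean) => "meancol")

same = a.meancol == b.meancol
```

Either form can be used with `transform!` to modify `df` in place instead of returning a copy.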