Using DataFramesMeta (or another package), can I select several columns of my choice and run a calculation on each of them?
Here is some mock data. I would like to group by column x and then calculate column-wise means within each group. The code below gives the result I am after.
df = DataFrame(x = [1,1,1,2,3,2,3,2,3,], y1 = [2,1,2,1,2,1,2,1,2], y2 = [5,7,6,5,7,6,5,7,6], y3 = [4,2,3,4,2,3,4,2,3])
l = [:y1, :y2, :y3]
by(df, :x, d->mean(Array{Float64,2}(d[l]); dims=1))
3×4 DataFrame
│ Row │ x │ x1      │ x2  │ x3  │
├─────┼───┼─────────┼─────┼─────┤
│ 1   │ 1 │ 1.66667 │ 6.0 │ 3.0 │
│ 2   │ 2 │ 1.0     │ 6.0 │ 3.0 │
│ 3   │ 3 │ 2.0     │ 6.0 │ 3.0 │
I would also like to do this with @linq so that it can be chained with other operations, but my attempt does not work (obviously):
@linq df |> by(:x, mean(Array{Float64,2}(l; dims=2)))
What I want to do is:
- group the data frame by column x
- select the columns of my choice (y1, y2, y3)
- calculate the column-wise means of y1, y2, y3 within each group
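For reference, the three steps above could be sketched roughly as follows. This assumes a recent DataFrames.jl (1.x), where `groupby` plus `combine` takes the place of `by`; it is not a DataFramesMeta/@linq solution, and the names `res` and `res_piped` are only illustrative.

```julia
using DataFrames, Statistics

df = DataFrame(x  = [1,1,1,2,3,2,3,2,3],
               y1 = [2,1,2,1,2,1,2,1,2],
               y2 = [5,7,6,5,7,6,5,7,6],
               y3 = [4,2,3,4,2,3,4,2,3])
l = [:y1, :y2, :y3]

# Group by x (sorted so the group order is predictable), then apply `mean`
# to each selected column. `l .=> mean` builds one `column => mean` pair per
# entry of l; the result columns are named y1_mean, y2_mean, y3_mean.
res = combine(groupby(df, :x; sort = true), l .=> mean)

# The same steps as a pipeline, so further operations can be appended.
res_piped = df |>
    (d -> groupby(d, :x; sort = true)) |>
    (g -> combine(g, l .=> mean))
```

The broadcasted pair `l .=> mean` is what lets the column list stay in a variable, so changing `l` is enough to change which columns are averaged.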
Any ideas?
Thanks,