If you’re interested in using a macro convenience package, https://github.com/jkrumbiegel/DataFrameMacros.jl makes such multi-column operations simpler to write.
Every symbol like :id or expression within { } is understood as one or many columns. The whole function expression is then broadcast over all of these collections of columns. In your case that could be one of the following options:
using DataFrameMacros
using DataFrames
df = DataFrame(id=1:3, A=11:13, B=101:103, C = 25:27, D = 32:34)
gdf = groupby(df, :id)
# The four calls below are alternative spellings of the same operation on this df; pick one.
@transform!(gdf, :id + {[:A, :B, :C, :D]})
@transform!(gdf, :id + {Between(:A, :D)})
@transform!(gdf, :id + {Not(:id)})
@transform!(gdf, :id + {r"[ABCD]"})
The nice thing is that DataFrameMacros handles the conversion to the same vectors of strings that @sijo was constructing manually via vcat.( for you. You can use any format that you’d normally use as the selector in names(df, selector).
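If you want to check which columns a given selector picks up, you can pass it to names yourself. A small sketch with the same df as above; it only needs DataFrames, and the variable names are just for illustration:

```julia
using DataFrames

df = DataFrame(id=1:3, A=11:13, B=101:103, C=25:27, D=32:34)

# Each selector resolves to a vector of column names (as strings).
# On this particular df all four pick the same columns.
cols_vec     = names(df, [:A, :B, :C, :D])
cols_between = names(df, Between(:A, :D))
cols_not     = names(df, Not(:id))
cols_regex   = names(df, r"[ABCD]")
```

Note that Not(:id) and the regex depend on what else is in the table, so they can select more columns once new ones have been added.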
For a little more convenience, you can quickly name your new columns with a shortcut syntax:
Compare
julia> @transform!(gdf, :id + {[:A, :B, :C, :D]})
3×9 DataFrame
 Row │ id     A      B      C      D      id_A_+  id_B_+  id_C_+  id_D_+
     │ Int64  Int64  Int64  Int64  Int64  Int64   Int64   Int64   Int64
─────┼──────────────────────────────────────────────────────────────────
   1 │     1     11    101     25     32      12     102      26      33
   2 │     2     12    102     26     33      14     104      28      35
   3 │     3     13    103     27     34      16     106      30      37
with
julia> @transform!(gdf, "id_plus_{2}" = :id + {Between(:A, :D)})
3×9 DataFrame
 Row │ id     A      B      C      D      id_plus_A  id_plus_B  id_plus_C  id_plus_D
     │ Int64  Int64  Int64  Int64  Int64  Int64      Int64      Int64      Int64
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │     1     11    101     25     32         12        102         26         33
   2 │     2     12    102     26     33         14        104         28         35
   3 │     3     13    103     27     34         16        106         30         37
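For comparison, here is roughly what the named version amounts to in plain DataFrames, without the macro package. This is only a sketch of the idea (one source => function => destination pair per selected column), not necessarily what DataFrameMacros generates internally; ops and result are illustrative names:

```julia
using DataFrames

df = DataFrame(id=1:3, A=11:13, B=101:103, C=25:27, D=32:34)

# Build one pair per column selected by Between(:A, :D):
# ["id", col] are the inputs, ByRow(+) adds them row by row,
# and the string on the right is the name of the new column.
ops = [["id", col] => ByRow(+) => "id_plus_$col" for col in names(df, Between(:A, :D))]

# transform returns a new data frame and leaves df untouched
result = transform(df, ops)
```

The macro form saves you from writing the comprehension and the string interpolation by hand.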