I’m working with dplyr today and ran into something that often annoys me.
If I have a number of operations to perform repeatedly on a data frame, it’s useful to put them in a function. However, because of R’s lazy evaluation, using dplyr operations in a function requires the use of quo and !!, operations that honestly are a big pain to deal with.
R puts you in a position where, if you want to put things in functions, you either have to use quo and !! all the time or revert to the less-than-desirable base df$ syntax.
What I want is to be able to use functions without ever having to leave the nice “lazy” way of doing things. So I tried out DataFramesMeta to see if Julia is better than R at this.
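using DataFrames, DataFramesMeta

function new100(df, x)
    # the keyword `x` is taken literally here; the argument x is never used
    transform(df, x = 100)
end

df = DataFrame(rand(5,5))
new100(df, :y)
# makes a new column :x, not :y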
│ Row │ x1       │ x2       │ x3         │ x4        │ x5        │ x   │
├─────┼──────────┼──────────┼────────────┼───────────┼───────────┼─────┤
│ 1   │ 0.804735 │ 0.864846 │ 0.575764   │ 0.790783  │ 0.671446  │ 100 │
│ 2   │ 0.462695 │ 0.83796  │ 0.00718302 │ 0.566886  │ 0.36546   │ 100 │
│ 3   │ 0.905707 │ 0.387376 │ 0.550329   │ 0.0816239 │ 0.422751  │ 100 │
│ 4   │ 0.251054 │ 0.109737 │ 0.828423   │ 0.382226  │ 0.474074  │ 100 │
│ 5   │ 0.136172 │ 0.326156 │ 0.826901   │ 0.137245  │ 0.0502391 │ 100 │
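For contrast, here is a minimal sketch of what I would like new100(df, :y) to do, written with plain DataFrames indexing instead of the keyword syntax. It assumes a reasonably recent DataFrames.jl where df[!, col] .= value creates a column; the name new100_indexed and the small example data frame are made up for illustration. It gets the column name right, but it gives up exactly the “lazy” style I am asking about.

```julia
using DataFrames

# Hypothetical variant of new100: the column name arrives as a Symbol in `x`
# and is used through ordinary indexing rather than as a keyword argument.
function new100_indexed(df, x::Symbol)
    out = copy(df)      # leave the caller's data frame untouched
    out[!, x] .= 100    # create (or overwrite) the column named by `x`
    return out
end

df = DataFrame(a = 1:3, b = [0.1, 0.2, 0.3])
res = new100_indexed(df, :y)   # the new column is called :y, not :x
```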
I understand that the code for DataFramesMeta is already pretty difficult, with a lot of complicated parsing and macro code. Is what I want, namely having new100(df, :y) create a column :y, possible to do now with DataFramesMeta and Lazy? If not, is it a feasible feature to add?
Edit: I remembered that I posted the exact same thing earlier, in “Replicating a useful Stata Workflow with DataFramesMeta”.
I’m keeping this post for posterity and searchability.
The takeaway is that this would require a PR to work.