Is it possible to group a DataFrame by something other than an existing column? I could add a column to the DataFrame that contains the computed values and then group on that new column, but this may clutter the DataFrame with columns that I use only once. Here’s some pseudo-syntax for what I would like to do, using the iris dataset from RDatasets: compute the mean petal width for irises depending on whether their sepal length is greater than 5.
by(iris,iris[:SepalLength].>5.,df->mean(df[:PetalWidth]))
I have the sense that there has to be an easy way to do this and I am just missing something.
Obviously I could do this:
iris[:big_length]=iris[:SepalLength].>5.
by(iris,:big_length,df->mean(df[:PetalWidth]))
But I am trying to avoid adding a new column to iris.