Grouping a DataFrame by something other than an existing column

Seth_Chandler · August 6, 2017, 12:50am

Is it possible to group a DataFrame by something other than an existing column? I could add a column containing the computed values and then group on that new column, but this clutters the DataFrame with columns that I use only once. Here’s some pseudo-syntax for what I would like to do, using the iris dataset from RDatasets: compute the mean petal width for irises depending on whether their sepal length is greater than 5.

 by(iris,iris[:SepalLength].>5.,df->mean(df[:PetalWidth]))

I have the sense that there has to be an easy way to do this and I am just missing something.

Obviously I could do this:

 iris[:big_length] = iris[:SepalLength] .> 5.0
 by(iris, :big_length, df -> mean(df[:PetalWidth]))

But I am trying to avoid adding a new column to iris.

davidanthoff · August 6, 2017, 4:20am

You can do this with this Query.jl query:

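using DataFrames, Query, RDatasets
# On Julia 0.7 and later, also add `using Statistics` for `mean`.
iris = dataset("datasets", "iris")

df = @from i in iris begin
    @group i.PetalWidth by i.SepalLength > 5 into g
    @select {big_length = g.key, PetalWidth = mean(g)}
    @collect DataFrame
end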

bramtayl · August 6, 2017, 1:49pm

Edit: never mind, a new variable is indeed required…

Or LazyQuery:

using RDatasets
using LazyQuery

@new_environment

@chain @evaluate begin
    using LazyQuery
    using RDatasets

    dataset("datasets", "iris")
    @add_to it BigLength = ~SepalLength > 5
    @group it by BigLength
    @make_from it PetalWidth = mean(PetalWidth)
    @ungroup it
end
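
For readers on a newer setup, here is a minimal sketch of two ways to get the per-group mean without touching the original table. It assumes a recent DataFrames.jl release (1.x) and Julia 1.2 or later, which is newer than the `by` syntax used above. The small `df` and its values are made up for illustration and are not the real iris data; `tmp`, `result`, `mask`, and `means` are hypothetical names.

```julia
using DataFrames, Statistics

# Illustrative rows only (assumption: not the real iris data).
df = DataFrame(SepalLength = [5.1, 4.9, 6.3, 5.8, 4.6],
               PetalWidth  = [0.2, 0.2, 2.5, 1.2, 0.2])

# Option 1: group on a temporary copy.
# `transform` returns a new DataFrame, so `df` itself gains no column.
tmp = transform(df, :SepalLength => ByRow(>(5)) => :big_length)
result = combine(groupby(tmp, :big_length), :PetalWidth => mean => :PetalWidth)

# Option 2: skip grouping and index with a Boolean mask instead.
mask = df.SepalLength .> 5
means = Dict(flag => mean(df.PetalWidth[mask .== flag]) for flag in (false, true))
```

Option 1 still creates the helper column, but only on the throwaway `tmp`. Option 2 avoids the column entirely and is probably only practical when the computed key takes a handful of known values.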
