[ANN] FilteredGroupbyMacro.jl

jules · February 9, 2020, 12:18pm

This is a really simple package (my first macro package) that does only one thing. It exports the macro @by, which combines filtering with the split-apply-combine approach in the style of R’s data.table (or at least something approximating it), with the DataFrames function by doing the work behind the scenes.

I’ve missed having this concise syntax for a while, as it ticks the most common boxes for what I do with DataFrames and needs no redundant column or dataframe variable names, or complex signatures like newcol = (:columnA, :columnB) => x -> x.columnA .+ x.columnB.

The package is currently waiting for inclusion in the General registry, so until then you need to install it via:

]add https://github.com/jkrumbiegel/FilteredGroupbyMacro.jl

Here are the docs: Home · FilteredGroupbyMacro.jl

Here is a short example from the README:

using RDatasets
using FilteredGroupbyMacro
using StatsBase

diamonds = dataset("ggplot2", "diamonds")

# filter by Price and Carat
# then group by Cut
# finally compute new columns with keyword names

@by diamonds[(:Price .> 3000) .& (:Carat .> 0.3), :Cut,
    MeanPricePerCarat = mean(:Price) / mean(:Carat),
    MostFreqColor = sort(collect(countmap(:Color)), by = last)[end][1]]

Compare this to the default DataFrames syntax:

using DataFrames  # provides by

by(diamonds[(diamonds.Price .> 3000) .& (diamonds.Carat .> 0.3), :], :Cut,
    MeanPricePerCarat = (:Price, :Carat) => x -> mean(x.Price) / mean(x.Carat),
    MostFreqColor = :Color => x -> sort(collect(countmap(x)), by = last)[end][1])

You can also use := assignment syntax to join the groupby result with the filtered table:

using FilteredGroupbyMacro
using DataFrames

df = DataFrame(a = repeat(1:3, 3), b = repeat('a':'c', 3))

# the result of this will be df with a new column sum_a
# that contains the same sum_a for every row in each group based on :b
@by df[!, :b, sum_a := sum(:a)]
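
To make the result of the := form concrete, here is a rough sketch of the same computation in plain DataFrames. It assumes a DataFrames 1.x release, where transform on a grouped data frame writes each group's result back to every row of that group. It is not the macro's actual expansion, which relies on by and a join as described above.

```julia
using DataFrames

df = DataFrame(a = repeat(1:3, 3), b = repeat('a':'c', 3))

# Assumption: DataFrames 1.x API, not what @by expands to.
# Group by :b, sum :a within each group, and attach the per-group sum
# to every row of the original table as a new column sum_a.
result = transform(groupby(df, :b), :a => sum => :sum_a)
```

Every row keeps its original a and b values, and rows that share a value of b share the same sum_a.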
