Dynamic selection of columns in Query.jl

I spent some time down this rabbit hole as well, and the closest I got to an answer is a negative one:

As of 2017, a technical constraint prevented Query.jl from supporting dynamic column selection. See the discussion here.

I am not aware of any updates but would love to see them if there are any.

The cleanest way I currently know of is to select the columns up front, as in bkamins’ answer, and use Query.jl after that:

using DataFrames
using Statistics
using Query
using RDatasets

df = dataset("datasets", "mtcars");

targets = [:MPG, :Cyl, :HP]

# select the target columns up front, then hand the result to Query.jl
df[:, targets] |>
  @filter(_.MPG > 15) |>
  @groupby(_.Cyl) |>
  @map({Cyl = key(_), AvgHP = mean(_.HP)})

As mentioned in the linked discussion, this sometimes means doing things in two stages (i.e., when the column list is itself the result of a computation halfway down the pipe), but it’s totally workable.
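
For the two-stage case, here is a minimal sketch, assuming a recent DataFrames.jl (1.x) and the same mtcars data. The rule used to pick columns (keep every numeric column) is only a hypothetical stand-in for whatever computation actually produces your column list, and the names stage1 and result are mine:

```julia
using DataFrames
using Statistics
using Query
using RDatasets

df = dataset("datasets", "mtcars")

# Stage 1: run the first half of the pipe and materialize it as a DataFrame.
stage1 = df |>
  @filter(_.MPG > 15) |>
  DataFrame

# Compute the column list from the intermediate result.
# Hypothetical rule for illustration only: keep every numeric column.
targets = [n for n in names(stage1)
           if nonmissingtype(eltype(stage1[!, n])) <: Number]

# Stage 2: select those columns with plain DataFrames indexing,
# then continue in Query.jl as before.
result = stage1[:, targets] |>
  @groupby(_.Cyl) |>
  @map({Cyl = key(_), AvgHP = mean(_.HP)}) |>
  DataFrame
```

The break between the stages is the only place where the column names are dynamic. Inside each Query.jl pipeline the columns are still referenced literally (_.MPG, _.Cyl, _.HP), so the computed list has to contain whatever the second stage refers to.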