[ANN] DataFrameMacros.jl v0.3.0

I made some breaking changes to DataFrameMacros, which hopefully make the package easier to use and more powerful.

    • Breaking: The $() interpolation syntax is replaced by {} for single columns (or broadcasted multi-columns)
    • Added {{}} for referring to multiple columns as a tuple.
    • Breaking: The flag macros are gone, replaced by the explicit @byrow, @bycol, @passmissing and @astable. This is also what DataFramesMeta uses, and I was finally convinced that it is better (less confusing) than the single-character macros I liked at first for their brevity.
    • Breaking: All(), Between() and Not() have to be interpolated with {} and can’t be used standalone anymore.

An example of the new {{}} syntax is below. For a long time I was annoyed that there was no easy way to refer to multiple columns together in one expression, so that you can aggregate over them and compare the result with some other columns at the same time. I already had implicit broadcasting for {}, but that only makes it easier to run the same expression on each of several columns, not on several columns together. The {{}} syntax is replaced with a tuple of the specified columns, so you can run aggregations on that tuple. Because of the tuple-ization it’s probably not useful for very large numbers of columns (too much compilation overhead, I assume), but for normal workloads it should be fine.

julia> df = DataFrame(
           jan = randn(5),
           feb = randn(5),
           mar = randn(5),
           apr = randn(5),
           may = randn(5),
           jun = randn(5),
           jul = randn(5),
       )
5×7 DataFrame
 Row │ jan        feb         mar        apr        may        jun        jul  ⋯
     │ Float64    Float64     Float64    Float64    Float64    Float64    Floa ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539   1.52392   -0.8 ⋯
   2 │  2.04818    0.980968   -0.843878  -0.281133   0.260402  -1.77773    0.3
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489  -2.93306   -0.1
   4 │  0.459416   0.273815    0.327215  -0.71741   -0.880897   0.782258   2.3
   5 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726   2.31358   -0.9 ⋯
                                                                1 column omitted
julia> @select(df, :july_larger = :jul > median({{Between(:jan, :jun)}}))
5×1 DataFrame
 Row │ july_larger
     │ Bool
─────┼─────────────
   1 │       false
   2 │        true
   3 │        true
   4 │        true
   5 │       false
julia> @select(df, :mean_smaller = mean({{All()}}) < median({{All()}}))
5×1 DataFrame
 Row │ mean_smaller
     │ Bool
─────┼──────────────
   1 │        false
   2 │         true
   3 │         true
   4 │        false
   5 │        false