Alternatives to Query.jl for aggregation

Hi everybody,

I am looking for a way to aggregate my data the way I currently do with Query.jl. Usually, I would run something like:

using DataFrames, Query, Statistics

test_df = DataFrame(A = [1, 1, 2, 2], B = [missing, 1, missing, 2])

df_agg = test_df |>
       @groupby(_.A) |>
       @map({key = key(_),
           B = first(_.B),
           B_last = last(_.B),
           B_mean = mean(skipmissing(_.B))}) |> DataFrame

But since Query.jl does not support missing values, mean(skipmissing(…)) always returns missing whenever a group contains any missing values. I have tripped over this too many times and found no obvious solution within Query.jl itself.

Of course, I can compute the mean myself beforehand and then take first(_.B_mean), but I wanted to know whether there is a viable alternative.

Thank you very much!

Regular DataFrames.jl will work:

combine(groupby(test_df, "A"), "B" .=> [first, last, mean ∘ skipmissing])
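If you also want to control the output column names, as in the Query.jl version, something like the following sketch should work. It assumes a recent DataFrames.jl (1.x) and uses the `source => function => target` pair syntax; the names B_first, B_last and B_mean are just illustrative choices.

```julia
using DataFrames, Statistics

test_df = DataFrame(A = [1, 1, 2, 2], B = [missing, 1, missing, 2])

# One row per value of A; each pair is source column => function => output name
df_agg = combine(groupby(test_df, :A),
    :B => first => :B_first,                 # first value in the group (may be missing)
    :B => last => :B_last,                   # last value in the group
    :B => (mean ∘ skipmissing) => :B_mean)   # mean over the non-missing values only
```

Here skipmissing does what you expect because DataFrames.jl stores the values as plain `missing`. I have not checked how this behaves for a group that contains only missing values, so that case may need separate handling.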