Hi all, due to the lack of copycalls kwarg in combine, I am not sure if there is a more efficient way to achieve what I want to do.
The following code groups the example data frame by ID and then sums the values in DataB column together, and should return a single row per group, but with all columns preserved. Is there a better way to do this?
using DataFrames
mydf = DataFrame(
ID=[:A,:A,:B,:C],
DataA=[10,10,5,7],
DataB=[12,63,4,5]
)
vcat([
let
unique(transform(df, :DataB => sum => :DataB, copycols=true))
end
for df in groupby(mydf, :ID)
]...)
Thank you! Is there some way to concisely specify :DataA => unique => :DataA, etc, for all the other columns? In my actual problem I have around 40 columns, which would be tedious to write out and modify if they change.
combine(groupby(mydf, :ID), 1:ncol(mydf) .=> maximum, :DataB => sum)
3Γ5 DataFrame
Row β ID ID_maximum DataA_maximum DataB_maximum DataB_sum
β Symbol Symbol Int64 Int64 Int64
ββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
1 β A A 10 63 75
2 β B B 5 4 4
3 β C C 7 5 5
As for @Rudi79βs suggestion, this requires all the values of each column to be constant for each valie of ID.
combine(groupby(mydf, propertynames(mydf)[1:ncol(mydf)-1]), :DataB => sum => :DataB)
3Γ3 DataFrame
Row β ID DataA DataB
β Symbol Int64 Int64
ββββββΌββββββββββββββββββββββ
1 β A 10 75
2 β B 5 4
3 β C 7 5
Saves typing column names, but you need the βconstant variablesβ all together in a block of columns, and the ones you want to apply the function to in a separate block.