How to add rows in a DataFrame

In a dataframe, how do we add rows containing numeric data where a column containing the NAME is same?

df1 = DataFrame(NAME = ["PP","PC","PP","AB","AF","AF"], VALUE =[100,200,300,400,500,100],
INCOME = [23,56,100,90,900,100])

From df1, I want to get to df2 with rows of VALUE and INCOME added together when NAME is same.

df2= DataFrame(NAME = ["PP","PC","AB","AF"], VALUE =[400,200,400,600],
INCOME = [123,56,90,1000])

Then I want to create another dataframe df3 that has same number of rows as df1 but updated numerical data from df2. A bit hacky solution for this will be

df3= rightjoin(df2,df1,on =:NAME,makeunique=true)
df3= select!(df3,Not(:VALUE_1))
df3= select!(df3,Not(:INCOME_1))

You want combine i.e.

combine(groupby(df1, :NAME), :VALUE => sum, :INCOME => sum)

Your plan for updating the values with a rightjoin is fine and what I would do.

But you can do all this in one step via transform.

transform(groupby(df1, :NAME), :VALUE => sum, :INCOME => sum)
1 Like

Thanks

This requires specifying names of each column I want to add, so could be difficult to use for a dataframe with many columns. Do Not and All function work with combine to select a range of column for sum ? I did try using them but I get MethodError

yeah, broadcasting doesn’t yet work for those. you want names(df2, Not(...)) .=> sum

Thanks