Sum the columns in a dataframe

This should be a relatively simple question, but I could not find a great answer. I have a dataframe like the one below (but with way more columns), and I wish to add a new column that is the sum of all the other columns (e.g. a + b + c). We can assume that all of the columns are Float64. I would like to do this over a large group of columns, so specifying each column one by one is not feasible. Many thanks!

df = DataFrame(a = repeat([1, 2, 3, 4], outer=[2]),
               b = repeat([2, 1], outer=[4]),
               c = randn(8))

Not at a computer, but I think something like sum(eachcol(df)) should work.

That seems to work:

julia> using DataFrames

julia> df = DataFrame(a = repeat([1, 2, 3, 4], outer=[2]),
                      b = repeat([2, 1], outer=[4]),
                      c = randn(8))
8×3 DataFrame
│ Row │ a     │ b     │ c         │
│     │ Int64 │ Int64 │ Float64   │
├─────┼───────┼───────┼───────────┤
│ 1   │ 1     │ 2     │ 2.33499   │
│ 2   │ 2     │ 1     │ 1.08153   │
│ 3   │ 3     │ 2     │ 1.55002   │
│ 4   │ 4     │ 1     │ -1.35953  │
│ 5   │ 1     │ 2     │ -1.87585  │
│ 6   │ 2     │ 1     │ 1.06405   │
│ 7   │ 3     │ 2     │ -0.129446 │
│ 8   │ 4     │ 1     │ 1.98992   │

julia> sum(eachcol(df))
8-element Array{Float64,1}:
 5.334989587238981
 4.081530272951787
 6.550024757198643
 3.640465366532979
 1.1241529384736229
 4.064050523357361
 4.8705544804273035
 6.989923186107735
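
Since the original question asks for the result as a new column, here is a minimal sketch of storing it back on the dataframe. The column name total and the small literal dataframe are my own choices for illustration, not something from the question.

```julia
using DataFrames

# Small stand-in for the real data (values chosen only for illustration)
df = DataFrame(a = [1.0, 2.0, 3.0], b = [2.0, 1.0, 2.0], c = [0.5, -1.5, 2.5])

# Capture the column names first, so re-running this line later
# does not add the total column into its own sum
cols = names(df)

# Add the selected column vectors together and store the result as a new column
df.total = sum(eachcol(df[!, cols]))
```

Selecting by cols also makes it easy to restrict the sum to a subset of columns if that is what you need.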

Are you coming from Stata, by chance? You can emulate rowtotal using ByRow in transform in DataFrames.
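
A rough sketch of what that could look like, assuming a DataFrames.jl version that supports AsTable and ByRow in transform. The output column name total and the example values are hypothetical.

```julia
using DataFrames

# Small stand-in dataframe (values chosen only for illustration)
df = DataFrame(a = [1.0, 2.0], b = [2.0, 1.0], c = [0.5, -1.5])

# AsTable(:) passes each row to the function as a NamedTuple of all columns;
# ByRow(sum) sums the values of that NamedTuple, one row at a time;
# the result is stored in a new column named :total
df2 = transform(df, AsTable(:) => ByRow(sum) => :total)
```

transform returns a new dataframe and leaves df untouched. Replacing the : inside AsTable with a list of column names should limit the sum to those columns.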

I am coming from an R background…

Thank you, the eachcol solution works. I tried transform, but I probably wasn't getting the syntax right.

Doesn't this sum the rows? To get the sum of each column I tried sum(eachrow(df)), but that doesn't work for me. Why not?

sum(eachcol(df)) does indeed sum across each row, since it essentially does sum([df[!, c] for c in names(df)]), i.e. it adds the column vectors together element by element. If you want to sum down each column, you should use sum.(eachcol(df)), which is essentially [sum(df[!, c]) for c in names(df)].
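
To make the difference concrete, here is a small sketch with a two-column dataframe whose values I made up. As far as I can tell, sum(eachrow(df)) fails because it tries to add whole rows to each other and + is not defined for dataframe rows, whereas broadcasting sum over the rows works.

```julia
using DataFrames

# Tiny example dataframe (values chosen only for illustration)
df = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0])

# Adds the column vectors a and b together: one total per row
row_totals = sum(eachcol(df))

# Broadcasts sum over the columns: one total per column
col_totals = sum.(eachcol(df))

# Broadcasts sum over the rows: also one total per row
row_totals_alt = sum.(eachrow(df))
```

Here row_totals and row_totals_alt are both [4.0, 6.0], and col_totals is [3.0, 7.0].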