# Sum the columns in a dataframe

This should be a relatively simple question, but I could not find a great answer. I have a dataframe like the one below (but with many more columns), and I wish to add a new column that is the sum of all the other columns (e.g. a + b + c). We can assume that all of the columns are `Float64`. I would like to do this over a large group of columns, so specifying each column one by one is not feasible. Many thanks!

``````
df = DataFrame(a = repeat([1, 2, 3, 4], outer=2),
               b = repeat([2, 1], outer=4),
               c = randn(8))
``````

Not at a computer but I think something like `sum(eachcol(df))` should work


That seems to work:

``````
julia> using DataFrames

julia> df = DataFrame(a = repeat([1, 2, 3, 4], outer=2),
                      b = repeat([2, 1], outer=4),
                      c = randn(8))
8×3 DataFrame
│ Row │ a     │ b     │ c         │
│     │ Int64 │ Int64 │ Float64   │
├─────┼───────┼───────┼───────────┤
│ 1   │ 1     │ 2     │ 2.33499   │
│ 2   │ 2     │ 1     │ 1.08153   │
│ 3   │ 3     │ 2     │ 1.55002   │
│ 4   │ 4     │ 1     │ -1.35953  │
│ 5   │ 1     │ 2     │ -1.87585  │
│ 6   │ 2     │ 1     │ 1.06405   │
│ 7   │ 3     │ 2     │ -0.129446 │
│ 8   │ 4     │ 1     │ 1.98992   │

julia> sum(eachcol(df))
8-element Array{Float64,1}:
5.334989587238981
4.081530272951787
6.550024757198643
3.640465366532979
1.1241529384736229
4.064050523357361
4.8705544804273035
6.989923186107735
``````
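
To actually store the result as a new column, as the original question asked, the vector can be assigned back to the data frame. A minimal sketch, assuming a recent DataFrames.jl; the column names `total` and `ab` and the fixed values in `c` (used instead of `randn` so the result is reproducible) are made up for illustration:

```julia
using DataFrames

df = DataFrame(a = repeat([1, 2, 3, 4], outer=2),
               b = repeat([2, 1], outer=4),
               c = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])

# Compute the row-wise sum first, then attach it as a new column.
# (`total` is a hypothetical column name.)
df.total = sum(eachcol(df))

# To sum only some of the columns, select them before summing.
df.ab = sum(eachcol(df[!, [:a, :b]]))
```

The right-hand side is evaluated before the assignment, so `total` is not included in its own sum.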

Are you coming from Stata, by chance? You can emulate `rowtotal` using `ByRow` in `transform` in DataFrames.
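
For reference, a sketch of what that could look like, assuming a DataFrames.jl version that supports `AsTable` as a column selector in `transform`. The column name `total` and the values in the example frame are made up:

```julia
using DataFrames

df = DataFrame(a = [1, 2, 3, 4],
               b = [2, 1, 2, 1],
               c = [0.5, 1.5, 2.5, 3.5])

# AsTable(:) hands each row to the function as a NamedTuple of all columns;
# ByRow(sum) then sums the values of that NamedTuple, one row at a time.
df2 = transform(df, AsTable(:) => ByRow(sum) => :total)
```

`transform` returns a new data frame and leaves `df` untouched; `transform!` would add the column in place instead.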


I am coming from an R background…

Thank you, the `eachcol` solution works. I tried `transform`, but I probably wasn't getting the syntax right.

Doesn’t this sum the rows? For the sum of each column, `sum(eachrow(df))` doesn’t work for me. Why not?


`sum(eachcol(df))` does indeed sum across the rows, since it essentially does `sum([df[!, c] for c in names(df)])`, i.e. it adds the column vectors elementwise. If you want to sum down each column, you should use `sum.(eachcol(df))`, which is essentially `[sum(df[!, c]) for c in names(df)]`.
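
A small sketch of the two directions side by side, with made-up values so the results are easy to check by hand. As far as I can tell, `sum(eachrow(df))` fails because `+` is not defined for two `DataFrameRow`s, whereas the columns are plain vectors that can be added elementwise:

```julia
using DataFrames

# Hypothetical example data, chosen so the sums are easy to verify.
df = DataFrame(a = [1, 2, 3], b = [10, 20, 30], c = [0.5, 0.5, 0.5])

# Sum across: add the column vectors elementwise, giving one value per row.
across = sum(eachcol(df))

# Sum down: broadcast `sum` over the columns, giving one value per column.
down = sum.(eachcol(df))
```

Here `across` has one entry per row (`11.5`, `22.5`, `33.5`) and `down` has one entry per column (`6`, `60`, `1.5`).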
