# Dataframe transform operation on multiple columns

Hello,

I am trying to use `transform` on multiple columns of a dataframe. That is, I want to apply an operation that takes two variables as inputs, column1 and column2, and outputs the result.

A simple example would be:

``````
df = DataFrame(A = 1:4, B = 1:4, C = 5:8);
transform!(df, :, [:B,:C] => (x,y) -> sumcols(x,y))
``````

where `sumcols` just sums the columns together. This should give me a column that looks like:

`[6;8;10;12]`

Of course, in this case I can just do `.+`, but such a syntax would be useful in many scenarios. Any ideas how I can get this?

Thanks!

``````
transform!(df, :, [:B,:C] => ByRow(+) => :newcolname)
``````

Thank you, this works for what I asked originally. However, is there something that works more generally, for any function `fun(:B,:C)` that outputs a vector of the appropriate size?

For example, say

``````
function f(x,y)
    N = size(x,1);
    z = zeros(N);
    # each entry depends on the tail of both columns, not just on row i
    for i=1:N
        z[i] = maximum(x[i:end]) + minimum(y[i:end]);
    end
    return z
end
``````

with the output:
`[9;10;11;12];`

Thanks!

Interesting, I did not know about `ByRow`. I was using a slightly more generic function that I also need in other contexts.

``````
broadwrap(f) = function (args...) broadcast(f, args...) end
transform!(df, [:B,:C] => broadwrap(+) => :D)
``````
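For anyone comparing the two approaches, here is a minimal sketch (assuming DataFrames.jl is installed) suggesting that `broadwrap(+)` and `ByRow(+)` should produce the same column for this data. The output column names `:D` and `:E` are just illustrative choices.

```julia
using DataFrames

# Wrap a scalar function so it is broadcast over the column vectors it receives.
broadwrap(f) = (args...) -> broadcast(f, args...)

df = DataFrame(A = 1:4, B = 1:4, C = 5:8)

transform!(df, [:B, :C] => broadwrap(+) => :D)  # broadcast B .+ C
transform!(df, [:B, :C] => ByRow(+) => :E)      # apply + once per row

df.D == df.E  # both columns hold the element-wise sums
```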

Yes, I get a similar limitation with your method to the one I mentioned in my previous response.

Thank you though!

In that example you could do

``````
transform!(df, :, [:B,:C] => (x,y) -> f(x,y))
``````

and it will work. If the function works on whole vectors, use this form; if it is something you want to apply to each row (broadcast), use `ByRow`.
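A minimal sketch of the two cases side by side, assuming DataFrames.jl is installed. The names `tailstat` and `rowsum` and the output column names `:whole` and `:byrow` are made up for illustration; `tailstat` is a compact variant of the `f` posted above.

```julia
using DataFrames

df = DataFrame(A = 1:4, B = 1:4, C = 5:8)

# Whole-column function: called once with the full vectors B and C,
# and must return a vector with one entry per row.
tailstat(x, y) = [maximum(x[i:end]) + minimum(y[i:end]) for i in eachindex(x)]
transform!(df, [:B, :C] => tailstat => :whole)

# Row-wise function: ByRow calls it once per row with scalar arguments.
rowsum(b, c) = b + c
transform!(df, [:B, :C] => ByRow(rowsum) => :byrow)
```

Wrapping `tailstat` in `ByRow` would not make sense here, since it needs to see the rest of each column, not a single row.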


In fact, just for completeness, I’ll note that you can even be lazy in your original example and just do

``````
transform!(df, :, [:B,:C] => +)
``````

because `+` also works for vectors.


I have to admit that I do not understand which limitation you are referring to.

This is perfect! Thanks!

Yes, sorry, I was not clear. Yours, at least as I applied it, was for row-by-row operations. What @tbeason has here:

``````
transform!(df, :, [:B,:C] => (x,y) -> f(x,y))
``````

works for what I intended.

Thanks!

1. Yes. My `broadwrap` function was for row-by-row operations, because `transform!` already works out of the box for operations applied directly to the whole column vectors (instead of row by row).
2. You do not need to write `transform!(df, :, [:B,:C] => (x,y) -> f(x,y))`; you can just write `transform!(df, [:B,:C] => f)`. `f` is already a function, and I am not sure why the colon would be needed.
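A quick sketch to check point 2, assuming DataFrames.jl is installed: both calls should give the same result. Here `f` is a compact variant of the function posted earlier in the thread, and the target column name `:D` is an arbitrary choice so that both calls name their output identically.

```julia
using DataFrames

# Compact variant of the whole-column function from earlier in the thread.
f(x, y) = [maximum(x[i:end]) + minimum(y[i:end]) for i in eachindex(x)]

df1 = DataFrame(A = 1:4, B = 1:4, C = 5:8)
df2 = copy(df1)

# Long form: explicit colon and an anonymous wrapper around f.
transform!(df1, :, [:B, :C] => ((x, y) -> f(x, y)) => :D)

# Short form: pass f directly, no colon.
transform!(df2, [:B, :C] => f => :D)

df1 == df2  # same columns and values either way
```

Note the extra parentheses around the anonymous function in the long form; they are needed once a target name such as `=> :D` follows it.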