# Multiplying columns in DataFrames

How can I multiply values in one column by other columns containing numerical data? Suppose a dataframe `df_1` is given by

``````df_1 = DataFrame(NAME = ["Aa","Bb","AB","CD","EF","GH"], SCORE=[0,1,10,20,30,45], VALUE =[400,200,400,600,10,20],
INCOME = [123,56,90,1000,50,60], GAIN =[55,65,111,65,34,100])
``````

I would like to produce equivalent to `df_2` that multiplies values in the column `SCORE` with the remaining columns

``````df_2 =DataFrame(NAME = ["Aa","Bb","AB","CD","EF","GH"], SCORE=[0,1,10,20,30,45],VALUE=df_1.SCORE.*df_1.VALUE,
INCOME = df_1.SCORE.*df_1.INCOME, GAIN =df_1.SCORE.*df_1.GAIN)
``````

I have tried using this, but it doesnβt work.

``````df_2 = names(df_1,Not(Between(:NAME,:SCORE))  .=>*:SCORE)
``````

Your last attempt is quite confusing. Did you mean to have a `transform` call in there? Ideally it would be obvious why that didnβt work, but if you explain your logic for how you arrived at that solution we could better point you in the right direction.

That said, this one is very tricky! Here is a solution

``````julia> nms = names(df_1, Between(:VALUE, :GAIN))
3-element Array{String,1}:
"VALUE"
"INCOME"
"GAIN"

julia> transform(df_1, vcat.("SCORE", nms) .=> ByRow(*) .=> nms)
6Γ5 DataFrame
Row β NAME    SCORE  VALUE  INCOME  GAIN
β String  Int64  Int64  Int64   Int64
ββββββΌβββββββββββββββββββββββββββββββββββββ
1 β Aa          0      0       0      0
2 β Bb          1    200      56     65
3 β AB         10   4000     900   1110
4 β CD         20  12000   20000   1300
5 β EF         30    300    1500   1020
6 β GH         45    900    2700   4500
``````
1 Like
``````df_2 =DataFrame(NAME = ["Aa","Bb","AB","CD","EF","GH"], SCORE=[0,1,10,20,30,45])

[df_2[!,c] = df_1.SCORE .* df_1[! ,c] for c in names(df_1)[3:end] ]
df_2
``````
1 Like

I wanted to produce exactly what you have done in your reply. My knowledge of dataframes is not good, so my attempt at it may not make any sense.

I converted the dataframe to a matrix for this operation, but then I was not sure how to change column names.

``````temp=Matrix{Int64}(select(df_1, Not(Between(:NAME,:SCORE))))
temp2=temp.*df_1.SCORE
df_t= DataFrame(temp2)
``````

Thanks

you can use `rename!(df, [:A, :B])` to change names.

Thank you

``````cols=eachcol(df_1)
map(x-> x.*=cols.SCORE, cols[3:5])
df_1=DataFrame(cols)
``````

or

``````transform(df_1, nms.=> (x-> x.*=cols.SCORE).=>nms)
``````

where nms is as defined by @pdeffebach

I continue to use this problem as a testing ground for the various forms expected by the transform function syntax.
Iβm trying to understand what is passed in the various possible ways to select the columns on which to apply the transformation and consequently how it can then be βworkedβ.

These are some of the tests made (after several failed attempts), on which maybe someone could add some explicit comments on why things go this way.

Why in the following case doesnβt the last AsTable produce new columns with the names x1, x2, β¦?

``````transform(df_1,AsTable(Between(:VALUE, :GAIN))=> (x-> DataFrame(x).*df_1.SCORE)=>AsTable)
6Γ5 DataFrame
Row β NAME    SCORE  VALUE  INCOME  GAIN
β String  Int64  Int64  Int64   Int64
ββββββΌβββββββββββββββββββββββββββββββββββββ
1 β Aa          0      0       0      0
2 β Bb          1    200      56     65
3 β AB         10   4000     900   1110
4 β CD         20  12000   20000   1300
5 β EF         30    300    1500   1020
6 β GH         45    900    2700   4500
``````

Instead in this case to get the desired names of the output columns you have to use a list of names,

``````transform(df_1,AsTable(Between(:VALUE, :GAIN))=> (x-> hcat(values(x)...).*df_1.SCORE)=>nms)
6Γ5 DataFrame
Row β NAME    SCORE  VALUE  INCOME  GAIN
β String  Int64  Int64  Int64   Int64
ββββββΌβββββββββββββββββββββββββββββββββββββ
1 β Aa          0      0       0      0
2 β Bb          1    200      56     65
3 β AB         10   4000     900   1110
4 β CD         20  12000   20000   1300
5 β EF         30    300    1500   1020
6 β GH         45    900    2700   4500
``````

Otherwise β¦

`````` transform(df_1,AsTable(Between(:VALUE, :GAIN))=> (x-> hcat(values(x)...).*df_1.SCORE)=>AsTable)
6Γ8 DataFrame
Row β NAME    SCORE  VALUE  INCOME  GAIN   x1     x2     x3
β String  Int64  Int64  Int64   Int64  Int64  Int64  Int64
ββββββΌββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
1 β Aa          0    400     123     55      0      0      0
2 β Bb          1    200      56     65    200     56     65
3 β AB         10    400      90    111   4000    900   1110
4 β CD         20    600    1000     65  12000  20000   1300
5 β EF         30     10      50     34    300   1500   1020
6 β GH         45     20      60    100    900   2700   4500
``````

I finally learned that using AsTable as a selector wrapper a namedtuple is passed.
In the first case I tried the DataFrame function on the namedtuple to be able to do. * Df_1.SCORE and I saw that it works.

In the second case, I had to use ββ¦β and the hcat function. I had tried to do Matrix (values (x)) but it didnβt work. Is there a more direct way to have a table / matrix on which to apply brodacast .* Df_1.SCORE?
I know this is not the best way to solve the problem, but I just want to understand the various possibilities.

Why in the following case doesnβt the last AsTable produce new columns with the names x1, x2, β¦?

Because a `DataFrame` has names associated with it, so the new columns take on those names.

I finally learned that using AsTable as a selector wrapper a namedtuple is passed.

This shouldnβt be a surprise, itβs in the documentation!

``````help?> AsTable
search: AsTable

AsTable(cols)

A type used for selection operations to signal that the columns selected by
the wrapped selector should be passed as a NamedTuple to the function.
``````

transform(df_1,AsTable(Between(:VALUE, :GAIN))=> (x-> hcat(values(x)β¦).*df_1.SCORE)=>nms)

The function here returns a matrix, which has no names associated with it, so you get `x1, x2, ...`

You can use `Tables.matrix((a = [1, 2, 3], b = [4, 5, 6]))`

1 Like

``````transform(df_1,AsTable(Between(:VALUE, :GAIN))=> (x-> x*df_1.SCORE)=>AsTable)