Hello!

What is the right way to transform a data frame in place with a function that returns multiple columns?

I want to write it like

transform!(df, :A=>(x->f(x))=>[:B,:C])

but it doesn't work.

**Edit** If the function returns a vector of vectors, it is interpreted as a vector of *rows*, not columns. One solution is to return a matrix instead. Note the space instead of a comma in `x->[x.+1 x.^2]`, which concatenates the two vectors horizontally into a matrix.

```
julia> df
3×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

julia> transform!(df, :x=>(x->[x.+1 x.^2]) => [:y, :z])
3×3 DataFrame
 Row │ x      y      z
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      1
   2 │     2      3      4
   3 │     3      4      9
```

The approach proposed by @skleinbo is probably the most typical one, but there are other ways:

```
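# row-wise function returning a NamedTuple; AsTable takes the column names from its fields
transform!(df, :x=>ByRow(r->(y=r+1, z=r^2))=>AsTable)
# not in place: returns a new data frame
hcat(df,DataFrame(y=df.x.+1, z=df.x.^2))
# in place: inserts the new columns starting at position 2
insertcols!(df,2, :y=>df.x.+1, :z=>df.x.^2)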
```

The function can return a "vector of vectors", but they are interpreted as rows:

```
julia> df = DataFrame(A=1:3)
3×1 DataFrame
 Row │ A
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

julia> transform!(df, :A => (x -> [[v+1, v+2] for v in x]) => [:B, :C])
3×3 DataFrame
 Row │ A      B      C
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      3
   2 │     2      3      4
   3 │     3      4      5
```

The general format of expected output with multiple output columns is:

- If the function returns one of `AbstractDataFrame`, `NamedTuple`, `DataFrameRow`, or `AbstractMatrix`, then the output columns are taken from its columns.
- If the function returns an `AbstractVector`, then each element of this vector must support the `keys` function, which must return a collection of `Symbol`s, strings, or integers; the return value of `keys` must be identical for all elements. As many columns are created as there are elements in the return value of `keys`.
- If the function returns a value of any other type, it is assumed to be a table conforming to the Tables.jl API, and `Tables.columntable` is called on it.
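
As a rough illustration of the first two rules, here is a minimal sketch that reuses the return shapes already shown in this thread. The column names `B` through `G` and the arithmetic are arbitrary choices made for the example, not anything prescribed by DataFrames.jl.

```julia
using DataFrames

df = DataFrame(A = 1:3)

# AbstractMatrix: each matrix column becomes one output column,
# named by the target list on the right.
transform!(df, :A => (x -> [x .+ 1 x .^ 2]) => [:B, :C])

# NamedTuple of vectors: the field names become the column names.
transform!(df, :A => (x -> (D = x .* 10, E = x .* 100)) => AsTable)

# AbstractVector whose elements support `keys` (here NamedTuples):
# each element is one row, and the keys become the column names.
transform!(df, :A => (x -> [(F = v - 1, G = -v) for v in x]) => AsTable)

println(names(df))
```

All three calls mutate `df` in place, so after running them the data frame holds the original column plus the six new ones.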

I take this opportunity to ask for some more details on the choices made regarding the possible outputs of `fun`.

```
# this
transform!(df, :x => (x -> [(v+1, v+2) for v in x]) => [:B, :C])
# is equivalent to this
transform!(df, :x => (x -> [(B=v+1, C=v+2) for v in x]) => AsTable)
```

My question is why this (an array of NamedTuples) works:

```
transform!(df, :x=>ByRow(r->(y=r+1, z=r^2))=>AsTable)
```

and this (a NamedTuple of arrays) does not:

```
transform!(df, :x=>r->(y = r.+1, z = r.^2)=>AsTable)
```

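It works; you just forgot the parentheses around the anonymous function. Without them, `=> AsTable` is parsed as part of the function body, so `transform!` never sees a target specification: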

```
julia> df = DataFrame(x=1:3)
3×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

julia> transform!(df, :x => (r->(y = r.+1, z = r.^2)) => AsTable)
3×3 DataFrame
 Row │ x      y      z
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      2      1
   2 │     2      3      4
   3 │     3      4      9
```
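
To see what the parentheses change, one can build both pairs without calling `transform!` and inspect them. This is only a small sketch; the variable names `unparenthesized` and `parenthesized` are made up for the illustration.

```julia
using DataFrames

# No parentheses: the anonymous function's body extends to the end of the
# expression, so `=> AsTable` ends up inside the function.
unparenthesized = :x => r -> (y = r .+ 1, z = r .^ 2) => AsTable

# With parentheses: the usual source => function => target shape.
parenthesized = :x => (r -> (y = r .+ 1, z = r .^ 2)) => AsTable

println(unparenthesized.second isa Function)    # true: source => function only
println(parenthesized.second isa Pair)          # true: function => target
println(parenthesized.second.second === AsTable)
```

In the first form `transform!` receives only `source => function`, and the function itself would return a `Pair` rather than a NamedTuple.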

I tried to benchmark the various approaches, but they all seem to perform about the same.

I couldn't measure `insertcols!`, though: `@btime` fails on the second pass because columns with those names already exist.

Was an option for `insertcols!` to overwrite an existing column ever considered?

If so, why was it rejected?

Use initialization code for `@btime`.

The point of `insertcols!` is that it should error in this case. If you want to overwrite an existing column, use `setindex!` or `setproperty!` (i.e. just write `df.col = vector` or `df[!, col] = vector`). Alternatively, pass `makeunique=true` to `insertcols!`; note that this avoids the error by giving the new column a deduplicated name rather than overwriting the existing one.
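
A possible way to put this together, assuming BenchmarkTools.jl is installed: the `setup` expression builds a fresh copy of the data frame for each sample, and `evals=1` makes sure every evaluation gets its own copy, so `insertcols!` never meets an already existing column. The name `d` and the sample data are placeholders for the illustration.

```julia
using DataFrames, BenchmarkTools

df = DataFrame(x = 1:3)

# Each sample runs on a fresh copy `d`, so the inserted columns never pre-exist.
@btime insertcols!(d, 2, :y => d.x .+ 1, :z => d.x .^ 2) setup=(d = copy($df)) evals=1

# Overwriting an existing column:
df.y = df.x .+ 1          # creates :y
df.y = df.x .* 2          # overwrites :y via setproperty!
df[!, :y] = df.x .^ 2     # overwrites :y via setindex!

# makeunique=true does not overwrite; it adds a column under a new name.
insertcols!(df, :y => df.x .* 3, makeunique=true)

println(names(df))
```

After the last call the original `y` column is untouched and the new values live in a separate, automatically renamed column.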