What does this mean (i.e. what do you want it to do)? For example, what even is a, b? If you want to do something and don't care about the argument, you should
julia> x
(a = 1, b = 2)
julia> values(x)
(1, 2)
julia> sum(x)
3
Background:
For DataFrames, I'd like an easy way to do row-level operations, use the column names as variable names, and have any newly created variables added automatically as new columns.
In this discussion a nice method was suggested,
but it requires writing a function in which you have to specify the column names as both inputs and outputs.
using DataFrames, Chain

function f(; a, b, kwargs...)  # kwargs... absorbs any columns not named here
c = a + b
d = a - b
(; a, b, c, d)
end
D = DataFrame(a = [1, 2], b = [3, 4])
@chain D begin
transform(AsTable(:) => ByRow(x -> f(; x...)) => AsTable)  # `; x...` splats the row as keywords
end
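A side note on the splatting in the `ByRow` call: `AsTable(:)` hands each row to the function as a named tuple, and a named tuple only becomes keyword arguments when it is splatted after a semicolon. Here is a minimal sketch in plain Julia, where `x` is a hypothetical stand-in for one row:

```julia
# A named tuple standing in for one row of the data frame (hypothetical values).
x = (a = 1, b = 2)

# Keyword-only function; `kwargs...` absorbs any fields it does not name.
function f(; a, b, kwargs...)
    c = a + b
    d = a - b
    return (; a, b, c, d)
end

# Splatting after the semicolon passes a = 1, b = 2 as keywords.
row = f(; x...)

# Splatting without the semicolon passes 1 and 2 positionally, which
# throws a MethodError because `f` accepts no positional arguments.
positional_fails = try
    f(x...)
    false
catch err
    err isa MethodError
end
```

So with a keyword-only `f`, the row has to be passed as `f(; x...)` rather than `f(x...)`.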
I'd like to avoid re-specifying the column names at all. Something like:
@chain D begin
***drop to row level**** begin
c = a + b
d = a - b
end
end
No, you can't avoid the colons. Columns are referenced as :x.
The reason is that we need a way to distinguish, at parse time, the columns in a data frame from other variables, without knowing anything about the data frame.
x = 1
@rtransform df :y = :x + x
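To make that snippet concrete, here is a small runnable sketch. The data frame `df` and its values are made up for illustration; it assumes DataFramesMeta.jl 0.9.0 or newer, where `@rtransform` is available.

```julia
using DataFrames, DataFramesMeta

df = DataFrame(x = [10, 20])   # hypothetical data; :x refers to this column
x = 1                          # ordinary Julia variable, unrelated to the column

# Row-wise: new column :y = (column :x) + (variable x)
df2 = @rtransform df :y = :x + x
```

The macro can tell the two apart purely from the syntax: `:x` is rewritten into a column lookup, while the bare `x` is left alone and resolved like any other variable.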
Obviously it's possible to make unquoted symbols (i.e. x) column references and leave special syntax for everything else. But then you would have to apply lots of escaping rules.
@rtransform df y = begin
$x = 100
x + z
end
This might get out of hand when people want to use missing, map with a function as the first argument, etc.
I would put a positive spin on this and say the use of :x makes code more readable because you can distinguish easily between columns and variables, which can get confusing in dplyr.
Finally, @rsubset exists in 0.9.0 and newer. @where is deprecated in favor of @subset, so there is just one filtering function in DataFramesMeta.jl.
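As a quick illustration of the row-wise and column-wise filters, here is a sketch with a made-up data frame (again assuming DataFramesMeta.jl 0.9.0 or newer):

```julia
using DataFrames, DataFramesMeta

df = DataFrame(a = [1, 2, 3], b = [3, 4, 5])   # hypothetical data

# Row-wise filter: the condition is evaluated once per row.
kept_rowwise = @rsubset df :a > 1

# Column-wise filter: the condition sees whole columns, so it must be broadcast.
kept_colwise = @subset df :a .> 1
```

Both calls keep the same two rows; the only difference is whether you write the broadcasting dot yourself.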
Last question. Within an @rtransform block, new columns can't depend on other new columns.
Is there a way to have a row level block with multiple interdependent new columns?
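One possibility worth checking, assuming a recent DataFramesMeta.jl release that provides the `@astable` flag: wrapping the block in `@astable` lets later assignments use columns created earlier in the same block. A minimal sketch with the same hypothetical `D` as above:

```julia
using DataFrames, DataFramesMeta

D = DataFrame(a = [1, 2], b = [3, 4])   # hypothetical data

out = @rtransform D @astable begin
    :c = :a + :b
    :d = :c - :b   # uses the :c created on the line above
end
```

The colons are still needed, but the column names only appear where they are actually used, and no separate function has to be written.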