How can I compute a variable with @combine only if some condition is met?
MWE:
using DataFrames, DataFramesMeta
dfTmp = DataFrame(y = 1:3, x = 10:12)
@combine dfTmp begin
    # Always compute `a`
    :a = :y * 2;
    # Only compute `b` if `x` is present
    :b = (:x in names(dfTmp)) ? (:x .+ 1) : zeros(nrow(dfTmp));
end
# Now remove `b` if `x` was not present
This is slightly clumsy, and it fails if x is not in names(dfTmp).
The alternative is to write a second @combine statement just for b and join the resulting DataFrames, but that seems clumsy and inefficient.
using TidierData
dftmp = DataFrame(y = 1:3, x = 10:12)
if "x" in names(dftmp)
    @transmute(dftmp, a = y * 2, b = x + 1)
else
    @transmute(dftmp, a = y * 2)
end
I am hoping to avoid duplicating all the code that computes variables in all cases (the a = y * 2 in this case).
This is especially important if there are multiple variables that may be missing from the DataFrame.
But thanks for the suggestion.
using TidierData
df = DataFrame(y = 1:3, x = 10:12)
@chain df begin
    @mutate(a = y * 2)
    if "x" in names(_) @mutate(_, b = x + 1) else _ end
    # more conditions here...
end
3×4 DataFrame
 Row │ y      x      a      b
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1     10      2     11
   2 │     2     11      4     12
   3 │     3     12      6     13
Just in case it's not clear, the _ is a placeholder within @chain. If the condition is not met, having the else _ ensures that you return the data frame without modifications so you can continue the chain. @mutate uses DataFrames.transform() under the hood.
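To illustrate the placeholder mechanics without any Tidier macros, here is a minimal sketch that uses plain DataFrames.transform inside @chain. It assumes the Chain package is installed, and the data frame deliberately has no x column so that the else _ branch is the one that runs.

```julia
using DataFrames, Chain

df = DataFrame(y = 1:3)  # deliberately has no `x` column

res = @chain df begin
    # No `_` on this line, so @chain passes the previous value as the first argument
    transform(:y => (y -> y .* 2) => :a)
    # `_` stands for the previous value; `else _` passes it through unchanged
    if "x" in names(_) transform(_, :x => (x -> x .+ 1) => :b) else _ end
end
```

Here res should contain only the columns y and a, since the b step is skipped.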
That works (even if x is not present, while DataFramesMeta does not work in that case).
I will mark it as the solution, even though I would prefer a solution that does not require me to write that one section of the code in the Tidier syntax, while I'm using DataFramesMeta everywhere else.
Thank you for the suggestion.
julia> using DataFrames, DataFramesMeta
julia> dfTmp = DataFrame(y = 1:3, x = 10:12)
3×2 DataFrame
 Row │ y      x
     │ Int64  Int64
─────┼──────────────
   1 │     1     10
   2 │     2     11
   3 │     3     12
julia> @combine dfTmp $AsTable = begin
           a = :y * 2
           if "x" in names(dfTmp)
               b = :x .+ 1
               (; a, b)
           else
               (; a)
           end
       end
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     2     11
   2 │     4     12
   3 │     6     13
Edit: no, this fails if x doesn't exist, because the generated function is passed to DataFrames with the arguments :y and :x.
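One possible way around this in plain DataFrames.jl (a sketch, not a DataFramesMeta solution) is to build the list of source => function => destination pairs conditionally, so that no operation ever references a missing column. The names conditional_combine and ops are made up for illustration.

```julia
using DataFrames

# Hypothetical helper: assembles the operations depending on which columns exist
function conditional_combine(df::AbstractDataFrame)
    # Always compute `a`
    ops = Any[:y => (y -> y .* 2) => :a]
    # Only request `b` when `x` is actually present
    if "x" in names(df)
        push!(ops, :x => (x -> x .+ 1) => :b)
    end
    return combine(df, ops...)
end

res_with_x = conditional_combine(DataFrame(y = 1:3, x = 10:12))
res_without_x = conditional_combine(DataFrame(y = 1:3))
```

Because the :x pair is never created when the column is absent, DataFrames never tries to look it up.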
Happy to provide a DataFramesMeta solution. Should be similar. Will take a look.
Does this work? (not at a computer)
using DataFramesMeta, Chain
df = DataFrame(y = 1:3, x = 10:12)
@chain df begin
    @rtransform(:a = :y * 2)
    if "x" in names(_) @rtransform(_, :b = :x + 1) else _ end
    # more conditions here...
end
That's an interesting option. Instead of writing different combinations of f, one could just put if/else logic into f (otherwise, writing out several versions of f could be tedious). I would have to check performance, though. Thanks.
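A rough sketch of what putting the if/else logic into f could look like, assuming f is an ordinary function handed to DataFrames.combine. Passing AsTable(:) as the source gives f a NamedTuple of all existing columns, so it can check for x itself. This is one reading of the idea, not a tested or benchmarked recommendation.

```julia
using DataFrames

# Hypothetical `f`: receives a NamedTuple of column vectors
function f(cols)
    out = (; a = cols.y .* 2)                  # always computed
    if haskey(cols, :x)                        # only when `x` exists
        out = merge(out, (; b = cols.x .+ 1))
    end
    return out
end

res_with_x = combine(DataFrame(y = 1:3, x = 10:12), AsTable(:) => f => AsTable)
res_without_x = combine(DataFrame(y = 1:3), AsTable(:) => f => AsTable)
```

The => AsTable destination expands the returned NamedTuple into one column per field, so the set of output columns follows whatever f decides to return.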