A very common operation in data manipulation: update col1 in place from col2, but only for rows where a condition on col3 holds. If we stick to DataFrames, it has to be:
dt[dt.c, :a] = dt[dt.c, :b]
so we have to carry dt. around everywhere, which gets unwieldy when the name dt is long.
The syntax in the title is from Stata. Because Stata handles only one dataframe at a time (an otherwise unattractive restriction), it can avoid repeating dt.
In data.table from R, we could also do:
dt[c, a := b]
# (May not be the exact syntax; it has been a while)
Any hacks to achieve similar simplicity in Julia?
Partially inspired by your second approach, I came up with this stopgap solution until DataFramesMeta tackles it officially:
Define a view version of @where, called @within:
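# Code copied from DataFramesMeta, with the result returned as a view.
# Note: fun_to_vec and df_to_bool are unexported internals and may differ between versions.
using DataFrames, DataFramesMeta

macro within(x, args...)
    esc(within_helper(x, args...))
end

function within_helper(x, args...)
    # Translate each condition expression into a select-style transformation
    t = (DataFramesMeta.fun_to_vec(arg; nolhs = true, gensym_names = true) for arg in args)
    quote
        $within($x, $(t...))
    end
end

function within(df::AbstractDataFrame, @nospecialize(args...))
    # Evaluate the conditions, then combine them into one Bool row mask
    res = DataFrames.select(df, args...; copycols = false)
    tokeep = DataFramesMeta.df_to_bool(res)
    # Return a view rather than a copy, so writes reach the parent df
    @view df[tokeep, :]
end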
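Returning a view matters because writes to a view go through to the parent data frame. Here is a minimal sketch of that behaviour using only plain DataFrames (no macros). The data frame dt and its columns a, b, c are made-up illustration data, and view(dt, dt.c, :) stands in for what within is meant to return:

```julia
using DataFrames

# Hypothetical example data: update `a` from `b` where `c` is true.
dt = DataFrame(a = [1, 2, 3, 4],
               b = [10, 20, 30, 40],
               c = [true, false, true, false])

# Row view over the rows where `c` holds (no copy of the data is made).
v = view(dt, dt.c, :)

# Assigning through the view writes into the parent `dt` in place.
v[:, :a] = v[:, :b]

println(dt.a)  # [10, 2, 30, 4]
```

After the row view is bound to a short name, dt no longer has to be repeated on both sides of the assignment.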