Easiest way to do "replace col1 = col2 if col3" in a dataframe

A very common operation in data manipulation: update col1 in place from col2, but only for rows where a condition on col3 holds. If we stick to DataFrames, it has to be:

dt[dt.c, :a] = dt[dt.c, :b]

so we have to repeat dt. everywhere, which gets unwieldy once the name dt is long.

The syntax in the title is from Stata. Thanks to its (unattractive) feature of handling only one dataframe at a time, it can avoid repeating dt.

In data.table from R, we could also do:

dt[c == TRUE, a := b]
# (May not be the exact syntax - it has been a while)

Any hacks to achieve similar simplicity in Julia?

A complicated version:

transform!(df, [:a, :b, :c] => ((a, b, c) -> ifelse.(c, b, a)) => :a)
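
For a concrete check, here is a minimal sketch of the transform! approach on a made-up four-row table (the data is purely illustrative), with the result written back to :a:

```julia
using DataFrames

# Toy data (illustrative only): replace a with b wherever c is true.
df = DataFrame(a = [1, 2, 3, 4],
               b = [10, 20, 30, 40],
               c = [true, false, true, false])

# ifelse.(c, b, a) picks b where c is true and keeps a otherwise;
# the resulting vector is stored in column :a.
transform!(df, [:a, :b, :c] => ((a, b, c) -> ifelse.(c, b, a)) => :a)

df.a  # [10, 2, 30, 4]
```

An equivalent, slightly shorter spelling should be `transform!(df, [:c, :b, :a] => ByRow(ifelse) => :a)`, since ByRow passes the three columns to ifelse row by row in the listed order.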

A shorter version would use DataFramesMeta:

@with df :a[:c] .= :b[:c]

This is definitely something we can focus on improving in the future.

Partially inspired by your second approach, I came up with this stopgap solution until it is officially tackled by DataFramesMeta:

Define a view version of @where, called @within:

using DataFrames, DataFramesMeta

# Adapted from DataFramesMeta's @where implementation; returns a view instead of a copy:
macro within(x, args...)
    esc(within_helper(x, args...))
end

function within_helper(x, args...)
    t = (DataFramesMeta.fun_to_vec(arg; nolhs = true, gensym_names = true) for arg in args)
    quote
        $within($x, $(t...))
    end
end

function within(df::AbstractDataFrame, @nospecialize(args...))
    res = DataFrames.select(df, args...; copycols = false)
    tokeep = DataFramesMeta.df_to_bool(res)
    @view df[tokeep, :]
end

so we could do:

dt = DataFrame(a = [1, 1, 2, 2], b = rand(4), c = Inf * ones(4))
@with @within(dt, :a .> 1) :b .= :c
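
The same idea can also be sketched without a custom macro or DataFramesMeta internals, using only a row view from DataFrames. This assumes a reasonably recent DataFrames.jl (one where filter accepts view = true); the name sub is just illustrative:

```julia
using DataFrames

dt = DataFrame(a = [1, 1, 2, 2], b = zeros(4), c = fill(Inf, 4))

# A SubDataFrame holding only the rows where a > 1; no data is copied,
# and dt does not have to be repeated inside the condition.
sub = filter(:a => >(1), dt; view = true)

# Assigning with `:` as the row index writes through to the parent's column.
sub[:, :b] = sub[:, :c]

dt.b  # [0.0, 0.0, Inf, Inf]
```

The trade-off is one extra line for the view, but it relies only on documented DataFrames indexing rather than on unexported helpers.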

An Excel-like solution (assumes numeric a and b, and a Bool c):

df.a .= (df.b .- df.a) .* df.c .+ df.a
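
A quick sketch of why the arithmetic form works, assuming c is a Bool column and a, b are floating-point: where c is true the expression reduces to (b - a) + a, and where c is false the product is zero, so a is kept. The toy data below is made up, with values chosen to be exactly representable so the comparison with ifelse is exact; in general the arithmetic form can differ from b by floating-point rounding.

```julia
using DataFrames

df = DataFrame(a = [1.0, 2.0, 3.0, 4.0],
               b = [10.0, 20.0, 30.0, 40.0],
               c = [true, false, true, false])

# Direct conditional selection, computed first for comparison.
expected = ifelse.(df.c, df.b, df.a)

# Arithmetic form: true acts as 1 and false as 0 in the product.
df.a .= (df.b .- df.a) .* df.c .+ df.a

df.a == expected  # true for this data
```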