My Stata benchmark wasn’t quite right. Stata doesn’t support multiple in-memory datasets, so the by operation I benchmarked was actually a by plus a join.
Regarding joins, for DataTables:
dt2 = by(dt, :B, d -> mean(d[:A]))
join(dt, dt2, on = :B)
For pandas:
df2 = mean(groupby(df, "B"))
df3 = merge(df, df2, left_on = "B", right_index = true)
I suspect there is some way to do this all in one go, like with broadcasting and the dot notation in Julia.
DataTables: 459s
Pandas: 14s
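On the "all in one go" question for pandas: one candidate is `groupby(...).transform`, which returns the group statistic already aligned to the original rows, so no separate merge is needed. The sketch below is in plain Python pandas rather than the Julia wrapper used above. The column names A and B follow the post, the toy values are made up, and it is not timed here, so it says nothing about how it would compare with the numbers above.

```python
import pandas as pd

# Toy data: column names follow the post, the values are illustrative only.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "B": ["x", "x", "y", "y"]})

# Two-step version, mirroring the post: aggregate per group, then merge back.
means = df.groupby("B")[["A"]].mean()
two_step = df.merge(means, left_on="B", right_index=True, suffixes=("", "_mean"))

# One-step version: transform broadcasts each group's mean back to its rows.
one_step = df.assign(A_mean=df.groupby("B")["A"].transform("mean"))

print(one_step)
```

Both versions should produce the same A_mean column; the second just skips building and joining the intermediate table.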