Hi there,
The split-apply-combine strategy is something that I use a lot, so I'm trying to understand a bit more about what goes on under the hood so that I can write better code. I was just looking at the following example:
using DataFrames, BenchmarkTools
df = DataFrame(x = rand(20), y = rand(20))
function test(df)
    df |>
        x -> transform!(x, :x => ByRow(val -> 2*val) => identity)
end
function test2(df)
    df.x = 2 .* df.x
end
Running @btime on test gives
while running @btime on test2 gives
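In case it helps to reproduce this, here is a minimal sketch of how the two functions can be timed. It assumes DataFrames.jl and BenchmarkTools.jl are installed; the `setup=` and `evals=1` arguments are my assumption (they give each run a fresh data frame, since both functions mutate their input) and are not necessarily the exact calls behind the timings above.

```julia
using DataFrames, BenchmarkTools

df = DataFrame(x = rand(20), y = rand(20))

# Same two functions as in the example above.
function test(df)
    df |>
        x -> transform!(x, :x => ByRow(val -> 2 * val) => identity)
end

function test2(df)
    df.x = 2 .* df.x
end

# Assumption: build a fresh DataFrame in `setup` and use `evals=1`, so that
# every measured call sees untouched data and no non-constant global is timed.
@btime test(d)  setup=(d = DataFrame(x = rand(20), y = rand(20))) evals=1
@btime test2(d) setup=(d = DataFrame(x = rand(20), y = rand(20))) evals=1
```

With `evals=1` the timings are noisier for such a small frame, but repeated calls no longer keep doubling the same column.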
My first question is: where do the 3 allocations in the first case come from? And the more important second question: why is there such a huge difference between the piping + transform! implementation and the plain broadcast assignment? I thought transform! was an in-place method.
Thanks!