Lag vector by group using another vector as the grouping variable

Agreed. Just make sure you don’t copy every column when you create the DataFrame; the constructor copies by default, and passing `copycols = false` avoids that:

julia> using DataFrames, BenchmarkTools

julia> firm_id = rand(["A", "B"], 100_000); revenue = rand(Int, 100_000); year = rand(2001:2004, 100_000);

julia> @btime DataFrame(firm_id = $firm_id, revenue = $revenue, year = $year);
  122.900 μs (34 allocations: 2.29 MiB)

julia> @btime DataFrame(firm_id = $firm_id, revenue = $revenue, year = $year; copycols = false);
  2.140 μs (27 allocations: 1.80 KiB)
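
The speedup comes with a trade-off worth keeping in mind: with `copycols = false` the DataFrame reuses your vectors rather than owning its own copies. Here is a minimal sketch of what that means, using small made-up vectors (the values and the names `df_copy` and `df_alias` are just for illustration):

```julia
using DataFrames

firm_id = ["A", "B", "A"]
revenue = [10, 20, 30]

# Default (copycols = true): the DataFrame gets fresh copies of the columns.
df_copy = DataFrame(firm_id = firm_id, revenue = revenue)

# copycols = false: the DataFrame reuses the existing vectors without copying.
df_alias = DataFrame(firm_id = firm_id, revenue = revenue; copycols = false)

df_copy.revenue === revenue    # false: a separate vector
df_alias.revenue === revenue   # true: the very same vector

# Consequence of aliasing: writing through the DataFrame changes `revenue` too.
df_alias.revenue[1] = 99
revenue[1] == 99               # true
df_copy.revenue[1] == 10       # true: the copy is unaffected
```

So `copycols = false` is a good fit when the DataFrame is only a temporary wrapper for the grouped operation, or when you don't mind the original vectors being modified through it.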