Hello everyone,
I was wondering whether it is possible to lag a vector by group when the grouping variable is a separate vector.
Here’s a minimal working example:
firm_id = ["A", "A", "A", "A", "B", "B", "B", "B"]
revenue = [100, 200, 300, 400, 50, 67, 75, 90]
year = [2001, 2002, 2003, 2004, 2001, 2002, 2003, 2004]
where you can think of revenue as panel data, with one observation per firm and year.
I’d like to lag the revenue variable by firm_id, to get the following result:
revenue_lag = [missing, 100, 200, 300, missing, 50, 67, 75]
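For concreteness, the operation can be sketched as a plain loop that never builds a DataFrame. This is only an illustration under assumptions: `lag_by_group` is a hypothetical helper name (not a function from ShiftedArrays or any other package), the vectors use ordinary 1-based indexing, and the rows are already sorted by firm_id and, within each firm, by year, as in the example above.

```julia
# Hypothetical helper (not part of any package): lag `x` by one period within each group.
# Assumes rows are sorted by group and, within each group, by time.
function lag_by_group(x::AbstractVector, g::AbstractVector)
    length(x) == length(g) || throw(DimensionMismatch("x and g must have the same length"))
    # Start with all entries missing; the first row of every group stays missing.
    out = Vector{Union{Missing, eltype(x)}}(missing, length(x))
    for i in 2:length(x)
        # Copy the previous value only when it belongs to the same group.
        if g[i] == g[i-1]
            out[i] = x[i-1]
        end
    end
    return out
end

firm_id = ["A", "A", "A", "A", "B", "B", "B", "B"]
revenue = [100, 200, 300, 400, 50, 67, 75, 90]

revenue_lag = lag_by_group(revenue, firm_id)
```

On the example data this reproduces the revenue_lag vector shown above. If the data were not sorted, the comparison with the previous row would no longer identify the previous year of the same firm, so a sort (or a precomputed index of each row's predecessor) would be needed first.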
In my project I have to perform this operation repeatedly for different values of the parameter I’m optimizing over.
I realize that this can be done by putting firm_id and revenue into a DataFrame and then lagging within each group, but I was wondering whether it is possible to do so without creating a DataFrame every time. My (uninformed) guess is that creating a DataFrame every time I need the lag is more time-consuming; please correct me if I’m wrong about this.
I saw that the ShiftedArrays package has a lag() function but couldn’t figure out how it could be applied by a grouping vector.
Thanks!