I’m not sure I understand what you’re trying to do, but if it’s “in every row, put the sum of the value
column for the two rows above, except when there aren’t two above, in which case put missing
”, you can combine RollingFunctions
with ShiftedArrays
like this:
julia> using DataFrames, ShiftedArrays, RollingFunctions
julia> df = DataFrame(region = ["US","US","US","US","EU","EU","EU","EU"],
year = [2010,2011,2012,2013,2010,2011,2012,2013],
value = [3,3,2,2,2,2,1,1]);
julia> transform(groupby(df, :region),
:value => (x -> running(sum, lag(x, 1), 2)) => :cumval)
8×4 DataFrame
Row │ region year value cumval
│ String Int64 Int64 Float64?
─────┼─────────────────────────────────
1 │ US 2010 3 missing
2 │ US 2011 3 missing
3 │ US 2012 2 6.0
4 │ US 2013 2 5.0
5 │ EU 2010 2 missing
6 │ EU 2011 2 missing
7 │ EU 2012 1 4.0
8 │ EU 2013 1 3.0
So EU in 2013 is 3 (sum of 2012 and 2011), in 2012 is 4 (sum of 2010 and 2011) while in 2011 it’s missing as there’s no 2009 observation.