Hi all,
I am trying to create a function that takes as inputs `df`, `key`, and `S`, where
- `df`: Data frame
- `key`: Variable to group on (namely subject ID)
- `S`: Number of strata

and creates a column in `df` named `stratum` that specifies which row belongs to which stratum (s = 1, ..., S).
For example I want to produce something like…
     Row │ ID  rᵢ  y.        v      x1   x2         x0   stratum
    ─────┼────────────────────────────────────────────────────
       1 │  1   1  1.5831    true   0.0  -0.522896  1.0        1
       2 │  1   2  4.01085   true   0.0  -0.522896  1.0        2
       3 │  1   3  4.39671   true   0.0  -0.522896  1.0        3
       4 │  1   4  4.84606   true   0.0  -0.522896  1.0        4
       5 │  1   5  4.99947   true   0.0  -0.522896  1.0        4
       6 │  1   6  5.26577   false  0.0  -0.522896  1.0        4
       7 │  2   1  0.113026  true   0.0  -0.780132  1.0        1
       8 │  2   2  0.849384  true   0.0  -0.780132  1.0        2
       9 │  2   3  2.25784   true   0.0  -0.780132  1.0        3
      10 │  2   4  3.01167   true   0.0  -0.780132  1.0        4
      11 │  2   5  4.98009   true   0.0  -0.780132  1.0        4
      12 │  2   6  5.24923   true   0.0  -0.780132  1.0        4
      13 │  2   7  5.25211   true   0.0  -0.780132  1.0        4
      14 │  2   8  5.27893   true   0.0  -0.780132  1.0        4
      15 │  2   9  5.36605   true   0.0  -0.780132  1.0        4
      16 │  2  10  5.72365   false  0.0  -0.780132  1.0        4
      17 │  3   1  0.362733  true   0.0   0.768787  1.0        1
      18 │  3   2  4.03361   true   0.0   0.768787  1.0        2
      19 │  3   3  7.0183    true   0.0   0.768787  1.0        3
      20 │  3   4  9.27818   true   0.0   0.768787  1.0        4
      21 │  3   5  9.70474   true   0.0   0.768787  1.0        4
      22 │  3   6  9.84579   false  0.0   0.768787  1.0        4
Essentially, `stratum` matches `rᵢ` up to 4, and `stratum = 4` for all `rᵢ > 4`. This corresponds to an input of `S = 4`.
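Read this way, the rule amounts to capping `rᵢ` at `S`, i.e. `stratum = min(rᵢ, S)`. A minimal sketch on a plain vector, using only Base Julia (the names `r` and `stratum` are illustrative stand-ins for the columns above):

```julia
S = 4        # number of strata
r = 1:6      # within-subject event index, as for ID 1 in the table

# Cap each index at S: every value above S collapses into the last stratum.
stratum = min.(r, S)

println(stratum)  # [1, 2, 3, 4, 4, 4]
```

The broadcasted `min.` works elementwise, so the same expression applies to a whole column at once.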
So far, I have something like this coded up:
    function createStrataByEvent(df::DataFrame, key, S::Int)
        for s in 1:S
            @chain df begin
                groupby(Symbol(key))
                @transform :stratum = ???
            end
        end
    end
But I am not sure how to create `:stratum`. At first glance, it seems like this can be achieved with an `ifelse` statement; however, I am not sure how to code the else portion, if this is the way forward.
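For what it is worth, one possible way to fill in the else branch is to return `S` itself whenever the within-group index exceeds `S`. The sketch below uses plain DataFrames.jl (`groupby` and `transform`) rather than the `@chain`/`@transform` macros, and it assumes the rows of each group are already sorted in event order, so that the within-group row number plays the role of `rᵢ`. The helper name `cap` is hypothetical.

```julia
using DataFrames

# Sketch only: assumes rows within each `key` group are already in event order.
function createStrataByEvent(df::DataFrame, key, S::Int)
    gdf = groupby(df, key)
    # Number the rows of a group 1..n, then cap with ifelse:
    # keep the index while it is <= S, otherwise ("else") use S.
    cap(x) = (r = 1:length(x); ifelse.(r .<= S, r, S))
    # transform returns a new DataFrame with rows in their original order.
    return transform(gdf, key => cap => :stratum)
end

# Small illustrative input: six rows for subject 1, three for subject 2.
df = DataFrame(ID = [1, 1, 1, 1, 1, 1, 2, 2, 2])
out = createStrataByEvent(df, :ID, 4)
```

Because the cap is applied once per group, no loop over `s in 1:S` is needed in this version. If an index column such as `rᵢ` is already present, the grouping step could be skipped and the same `ifelse.` expression applied to that column directly.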
Would appreciate any guidance on this.
Thanks,
Eric