Trying to understand the role of the isgathered keyword argument, I worked through this exercise.
julia> ds = Dataset(id = [1,1,1,1,1,2,2,2,3,3,3],
date = Date.(["2019-03-05", "2019-03-12", "2019-04-10",
"2019-04-29", "2019-05-10", "2019-03-20",
"2019-04-22", "2019-05-04", "2019-11-01",
"2019-11-10", "2019-12-12"]),
outcome = [false, false, false, true, false, false,
true, false, true, true, true])
11×3 Dataset
...
julia> gb = gatherby(ds, [1, 3], isgathered = true)
11×3 View of GatherBy Dataset, Gathered by: id ,outcome
 id        date        outcome
 identity  identity    identity
 Int64?    Date?       Bool?
───────────────────────────────
        1  2019-03-05     false
        1  2019-03-12     false
        1  2019-04-10     false
        1  2019-04-29      true
        1  2019-05-10     false
        2  2019-03-20     false
        2  2019-04-22      true
        2  2019-05-04     false
        3  2019-11-01      true
        3  2019-11-10      true
        3  2019-12-12      true
julia> combine(gb, (:) => last, dropgroupcols = true)
7×3 Dataset
...
I was wondering how many groups are generated by
julia> gb = gatherby(ds, [1, 3], isgathered = true)
because the output header does not report that information:
11×3 View of GatherBy Dataset, Gathered by: id ,outcome
I saw that the number of groups can be obtained in the following way
last(gb.groups)
but I don't think accessing that field directly is a recommended procedure.
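As a cross-check that does not touch gb.groups, here is a minimal plain-Julia sketch that counts the runs of consecutive equal (id, outcome) pairs. It assumes that, with isgathered = true, each such run forms one group; count_runs is a helper name I made up, not part of InMemoryDatasets.

```julia
# Key columns copied from the dataset above.
id      = [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
outcome = [false, false, false, true, false, false, true, false, true, true, true]

# One (id, outcome) tuple per row.
rowkeys = collect(zip(id, outcome))

# Hypothetical helper (not an InMemoryDatasets function):
# a new run starts at row 1 and wherever the key differs from the previous row.
count_runs(v) = isempty(v) ? 0 : 1 + count(i -> v[i] != v[i - 1], 2:length(v))

ngroups = count_runs(rowkeys)
println(ngroups)  # prints 7
```

The result agrees with the 7 rows returned by the combine call above, which is consistent with that assumption, but I would still like to know the supported way to get this number.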
Later, having read that groupby and gatherby accept the output of other grouping functions as input, I tried (in every variation I could think of), without success, to do something like this:
combine(gb, (:) => x -> groupby(x, 1))
So I tried to build the subgroups by hand and, after a few attempts, I found this "solution":
cgb1 = combine(gb, (:) => x -> [x], (:) => byrow(x -> Dataset(; zip([:a, :b, :c], x)...)), dropgroupcols = true)
At this point the question arises: why does the following "reduced" expression not work?
cgb1 = combine(gb, (:) => byrow(x -> Dataset(; zip([:a, :b, :c], [x])...)), dropgroupcols = true)