There are many questions, so let me show what I think you want (please comment if I missed something).
Variant 1: for each variable and each group, get a data frame with the result:
julia> combine(groupby(df, :gr), names(df, Number) .=> (x -> Ref(DataFrame(q=0.0:0.1:1.0, v=quantile(x, 0.0:0.1:1.0)))) => x -> x * "_DataFrame")
4×4 DataFrame
 Row │ gr    x1_DataFrame    x2_DataFrame    x3_DataFrame
     │ Char  DataFrame       DataFrame       DataFrame
─────┼─────────────────────────────────────────────────────
   1 │ A     11×2 DataFrame  11×2 DataFrame  11×2 DataFrame
   2 │ B     11×2 DataFrame  11×2 DataFrame  11×2 DataFrame
   3 │ C     11×2 DataFrame  11×2 DataFrame  11×2 DataFrame
   4 │ D     11×2 DataFrame  11×2 DataFrame  11×2 DataFrame
(Instead of Ref you could also wrap the value with [...], but Ref is the standard way in Base Julia broadcasting of turning any value into a scalar, so it is easier to remember.)
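The remark about Ref can be illustrated in plain Base Julia, without DataFrames.jl. This is only a small sketch; the names needles and haystack and their values are made up for the illustration:

```julia
# Made-up example data (not from the question).
needles = [3, 1, 5]
haystack = [1, 2, 3]

# Ref(haystack) makes broadcasting treat the whole vector as one scalar value,
# so every needle is checked against the complete haystack.
with_ref = in.(needles, Ref(haystack))

# Wrapping in a one-element vector has the same effect.
with_vec = in.(needles, [haystack])

# Without any wrapper, broadcasting pairs the two vectors element by element.
elementwise = in.(needles, haystack)
```

Here with_ref and with_vec are both [true, true, false], while elementwise is [false, false, false] because each needle is compared only with the element at the same position.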
Variant 2: expand the data frames into columns, but still keep the number of rows equal to the number of groups:
julia> combine(groupby(df, :gr), names(df, Number) .=> (x -> Ref(DataFrame(q=0.0:0.1:1.0, v=quantile(x, 0.0:0.1:1.0)))) => x -> x .* ["_q", "_v"])
4×7 DataFrame
 Row │ gr    x1_q                               x1_v                               x2_q                               x2_v                               x3_q            ⋯
     │ Char  Array…                             Array…                             Array…                             Array…                             Array…          ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ A     [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0100206, 0.105031, 0.146141, …  [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.026531, 0.106821, 0.25079, 0.…  [0.0, 0.1, 0.2, ⋯
   2 │ B     [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0201708, 0.14565, 0.178781, 0…  [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0379883, 0.0499197, 0.163857,…  [0.0, 0.1, 0.2,
   3 │ C     [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0320945, 0.12889, 0.188354, 0…  [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0560529, 0.166095, 0.221287, …  [0.0, 0.1, 0.2,
   4 │ D     [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0154812, 0.046749, 0.167502, …  [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0…  [0.0325466, 0.15281, 0.280586, 0…  [0.0, 0.1, 0.2,
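The way the column names are built in variants 1 and 2 can be checked in Base Julia alone. The sketch below uses hypothetical column names and sum as a stand-in for the quantile function, just to show how the broadcasted pair and the renaming function behave:

```julia
# Hypothetical column names and a stand-in function, for illustration only.
cols = ["x1", "x2"]
f = sum

# `=>` is right-associative, so this is cols .=> (f => renamer):
# each column name is paired with the same (function => renamer) pair.
ops = cols .=> f => (x -> x .* ["_q", "_v"])

# The renamer receives the source column name and returns the output names.
renamer = last(last(ops[1]))
new_names = renamer("x1")
```

With these inputs ops has one entry per column name, and new_names is ["x1_q", "x1_v"], which is how a single source column ends up as two output columns.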
Variant 3: as variant 2, but expand to as many rows as there are quantiles (each variable keeps a separate quantile column, since in general the quantiles could differ between variables):
julia> combine(groupby(df, :gr), names(df, Number) .=> (x -> DataFrame(q=0.0:0.1:1.0, v=quantile(x, 0.0:0.1:1.0))) => x -> x .* ["_q", "_v"])
44×7 DataFrame
 Row │ gr    x1_q     x1_v       x2_q     x2_v       x3_q     x3_v
     │ Char  Float64  Float64    Float64  Float64    Float64  Float64
─────┼─────────────────────────────────────────────────────────────────
   1 │ A         0.0  0.0100206      0.0  0.026531       0.0  0.0304922
   2 │ A         0.1  0.105031       0.1  0.106821       0.1  0.116344
   3 │ A         0.2  0.146141       0.2  0.25079        0.2  0.160595
   4 │ A         0.3  0.239598       0.3  0.275699       0.3  0.223479
   5 │ A         0.4  0.418623       0.4  0.391514       0.4  0.283464
   6 │ A         0.5  0.479909       0.5  0.463614       0.5  0.350202
   7 │ A         0.6  0.661491       0.6  0.478091       0.6  0.421991
   8 │ A         0.7  0.709587       0.7  0.626841       0.7  0.464356
   9 │ A         0.8  0.778766       0.8  0.721748       0.8  0.581408
  10 │ A         0.9  0.922159       0.9  0.941598       0.9  0.762324
  11 │ A         1.0  0.986275       1.0  0.995137       1.0  0.923933
  12 │ B         0.0  0.0201708      0.0  0.0379883      0.0  0.0213256
  13 │ B         0.1  0.14565        0.1  0.0499197      0.1  0.163012
  ⋮  │  ⋮       ⋮         ⋮         ⋮         ⋮         ⋮         ⋮
  33 │ C         1.0  0.976539       1.0  0.902687       1.0  0.838742
  34 │ D         0.0  0.0154812      0.0  0.0325466      0.0  0.0327547
  35 │ D         0.1  0.046749       0.1  0.15281        0.1  0.187093
  36 │ D         0.2  0.167502       0.2  0.280586       0.2  0.31834
  37 │ D         0.3  0.236399       0.3  0.328495       0.3  0.389418
  38 │ D         0.4  0.31478        0.4  0.452312       0.4  0.428514
  39 │ D         0.5  0.325001       0.5  0.527466       0.5  0.526287
  40 │ D         0.6  0.448259       0.6  0.591354       0.6  0.546437
  41 │ D         0.7  0.573627       0.7  0.672257       0.7  0.613559
  42 │ D         0.8  0.802225       0.8  0.720313       0.8  0.730664
  43 │ D         0.9  0.893947       0.9  0.890409       0.9  0.908996
  44 │ D         1.0  0.931859       1.0  0.949966       1.0  0.977479
                                                        19 rows omitted
Variant 4: as variant 3, but with a single quantile column:
julia> combine(groupby(df, :gr), names(df, Number) .=> (x -> quantile(x, 0.0:0.1:1.0)) => x -> x .* "_v", Returns((q=0.0:0.1:1.0,)))
44×5 DataFrame
 Row │ gr    x1_v       x2_v       x3_v       q
     │ Char  Float64    Float64    Float64    Float64
─────┼───────────────────────────────────────────────
   1 │ A     0.0100206  0.026531   0.0304922      0.0
   2 │ A     0.105031   0.106821   0.116344       0.1
   3 │ A     0.146141   0.25079    0.160595       0.2
   4 │ A     0.239598   0.275699   0.223479       0.3
   5 │ A     0.418623   0.391514   0.283464       0.4
   6 │ A     0.479909   0.463614   0.350202       0.5
   7 │ A     0.661491   0.478091   0.421991       0.6
   8 │ A     0.709587   0.626841   0.464356       0.7
   9 │ A     0.778766   0.721748   0.581408       0.8
  10 │ A     0.922159   0.941598   0.762324       0.9
  11 │ A     0.986275   0.995137   0.923933       1.0
  12 │ B     0.0201708  0.0379883  0.0213256      0.0
  13 │ B     0.14565    0.0499197  0.163012       0.1
  ⋮  │  ⋮        ⋮          ⋮          ⋮         ⋮
  33 │ C     0.976539   0.902687   0.838742       1.0
  34 │ D     0.0154812  0.0325466  0.0327547      0.0
  35 │ D     0.046749   0.15281    0.187093       0.1
  36 │ D     0.167502   0.280586   0.31834        0.2
  37 │ D     0.236399   0.328495   0.389418       0.3
  38 │ D     0.31478    0.452312   0.428514       0.4
  39 │ D     0.325001   0.527466   0.526287       0.5
  40 │ D     0.448259   0.591354   0.546437       0.6
  41 │ D     0.573627   0.672257   0.613559       0.7
  42 │ D     0.802225   0.720313   0.730664       0.8
  43 │ D     0.893947   0.890409   0.908996       0.9
  44 │ D     0.931859   0.949966   0.977479       1.0
                                      19 rows omitted
(Note here the way to return a constant column that does not depend on anything: you create a named tuple with the column name you want and wrap it in Returns.)
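What Returns does on its own can be seen in a short Base Julia sketch (Returns is available from Julia 1.7 on; the name const_col is only used here for illustration):

```julia
# Returns(value) builds a callable that ignores all of its arguments
# and always gives back the same value.
const_col = Returns((q = 0.0:0.1:1.0,))

no_args = const_col()
some_args = const_col("anything", 1, 2)
```

Both calls give the same named tuple with a single field q holding the 11 quantile levels, which is why combine can use it to produce the q column regardless of the group it is called on.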