I would like some clarification (or even just comments) on the behavior of the flatten function, specifically for column y in the example below.
using DataFrames
df = DataFrame(x = rand(1:20, 5), y = ["aa", ["a1", "a2"], "bbb", [1, 2, 3], "cc"], z = [11, 21, [31, 32], 41, 51])
flatten(df, :z)
flatten(df, :y)
df.yvec = [isa(i, Array) ? i : [i] for i in df.y]
flatten(df, :yvec)
julia> flatten(df,:z)
6×3 DataFrame
 Row │ x      y             z
     │ Int64  Any           Int64
─────┼────────────────────────────
   1 │     4  aa               11
   2 │     5  ["a1", "a2"]     21
   3 │     1  bbb              31
   4 │     1  bbb              32
   5 │     3  [1, 2, 3]        41
   6 │     5  cc               51
julia> flatten(df,:y)
12×3 DataFrame
 Row │ x      y    z
     │ Int64  Any  Any
─────┼──────────────────────
   1 │     4  a    11
   2 │     4  a    11
   3 │     5  a1   21
   4 │     5  a2   21
   5 │     1  b    [31, 32]
   6 │     1  b    [31, 32]
   7 │     1  b    [31, 32]
   8 │     3  1    41
   9 │     3  2    41
  10 │     3  3    41
  11 │     5  c    51
  12 │     5  c    51
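My guess is that this output comes from flatten simply iterating over each cell, whatever its type. That is an assumption about the implementation, not something I have checked in the source. It would fit with how iteration works in plain Julia: a String iterates character by character, while a number iterates as a single element, which would explain why "aa" becomes two rows of a whereas 11 in column z stays a single row. A minimal check, with no DataFrames involved:

```julia
# Strings are iterable: each iteration step yields one Char
chars = collect("aa")

# Numbers are iterable too, but they yield only themselves
n_len = length(11)
n_first = first(11)

# A vector yields its elements, as expected
vec_len = length([31, 32])

println(chars, " ", n_len, " ", n_first, " ", vec_len)
```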
I would have expected a result more like the one I get from the yvec column, where every non-array value is first wrapped in a one-element vector:
julia> df
5×4 DataFrame
 Row │ x      y             z         yvec
     │ Int64  Any           Any       Array…
─────┼─────────────────────────────────────────────
   1 │     4  aa            11        ["aa"]
   2 │     5  ["a1", "a2"]  21        ["a1", "a2"]
   3 │     1  bbb           [31, 32]  ["bbb"]
   4 │     3  [1, 2, 3]     41        [1, 2, 3]
   5 │     5  cc            51        ["cc"]
julia> flatten(df,:yvec)
8×4 DataFrame
 Row │ x      y             z         yvec
     │ Int64  Any           Any       Any
─────┼─────────────────────────────────────
   1 │     4  aa            11        aa
   2 │     5  ["a1", "a2"]  21        a1
   3 │     5  ["a1", "a2"]  21        a2
   4 │     1  bbb           [31, 32]  bbb
   5 │     3  [1, 2, 3]     41        1
   6 │     3  [1, 2, 3]     41        2
   7 │     3  [1, 2, 3]     41        3
   8 │     5  cc            51        cc
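The yvec workaround can also be folded into a small helper applied to the column before flattening. This is only a sketch: wrap is a hypothetical name of my own, not a DataFrames function, and testing against AbstractVector (rather than Array, as in my line above) is an assumption about what should count as "already a list". I use x = 1:5 instead of rand so the result is reproducible.

```julia
using DataFrames

# Hypothetical helper: leave vectors alone, wrap anything else in a 1-element vector
wrap(v) = v isa AbstractVector ? v : [v]

df = DataFrame(x = 1:5,
               y = ["aa", ["a1", "a2"], "bbb", [1, 2, 3], "cc"])

# Wrap every cell of y, then flatten; strings are no longer split into characters
wrapped = transform(df, :y => ByRow(wrap) => :y)
flat = flatten(wrapped, :y)

println(flat)
```

With this, the original y column is replaced in place instead of adding a second column, which is the only real difference from the yvec version.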