Hello,
I would like to transform a DataFrame like this:
df = DataFrame(a=[1,2],b=["yes","no"],c=[["a","b","c"],["d"]])
┌───────┬────────┬─────────────────┐
│ a     │ b      │ c               │
│ Int64 │ String │ Vector{String}  │
├───────┼────────┼─────────────────┤
│ 1     │ yes    │ ["a", "b", "c"] │
│ 2     │ no     │ ["d"]           │
└───────┴────────┴─────────────────┘
… into a DataFrame like the one below, where the values of columns a and b are repeated for every element of the array in column c. It needs to be efficient, since I have millions of rows and about a hundred elements in each of those c arrays:
goal = DataFrame(a=[1,1,1,2],b=["yes","yes","yes","no"],c=["a","b","c","d"])
┌───────┬────────┬────────┐
│ a     │ b      │ c      │
│ Int64 │ String │ String │
├───────┼────────┼────────┤
│ 1     │ yes    │ a      │
│ 1     │ yes    │ b      │
│ 1     │ yes    │ c      │
│ 2     │ no     │ d      │
└───────┴────────┴────────┘
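To spell out the intended result, here is a naive sketch of the row repetition written by hand, assuming DataFrames.jl is loaded. The names `lens`, `idx` and `expanded` are only illustrative, and I have not measured how this behaves on millions of rows:

```julia
using DataFrames

df = DataFrame(a=[1,2], b=["yes","no"], c=[["a","b","c"],["d"]])
goal = DataFrame(a=[1,1,1,2], b=["yes","yes","yes","no"], c=["a","b","c","d"])

# Number of elements in each row's c array
lens = length.(df.c)

# Each row index repeated once per element of that row's c array
idx = reduce(vcat, fill.(1:nrow(df), lens))

# Repeat a and b through the index vector, and concatenate the c arrays
expanded = DataFrame(a = df.a[idx], b = df.b[idx], c = reduce(vcat, df.c))

println(expanded == goal)
```

DataFrames.jl also has a `flatten(df, :c)` function that looks like it may cover this case, but I am not sure it is the right tool or how it scales.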
Also, does anyone know the name of this kind of transformation? Unsubpivoting, maybe?
Any help is appreciated!