Transforming a DataFrame with a column of vectors

Hello,

I would like to transform a DataFrame like this:

df = DataFrame(a=[1,2],b=["yes","no"],c=[["a","b","c"],["d"]])
┌───────┬────────┬─────────────────┐
│     a │      b │               c │
│ Int64 │ String │  Vector{String} │
├───────┼────────┼─────────────────┤
│     1 │    yes │ ["a", "b", "c"] │
│     2 │     no │           ["d"] │
└───────┴────────┴─────────────────┘

… into a DataFrame like the one below, where the values of columns a and b are repeated for every element of the arrays in column c. It needs to be efficient: I have millions of rows, and about a hundred elements in each array in column c:

goal = DataFrame(a=[1,1,1,2],b=["yes","yes","yes","no"],c=["a","b","c","d"])
┌───────┬────────┬────────┐
│     a │      b │      c │
│ Int64 │ String │ String │
├───────┼────────┼────────┤
│     1 │    yes │      a │
│     1 │    yes │      b │
│     1 │    yes │      c │
│     2 │     no │      d │
└───────┴────────┴────────┘

Also, does anyone know the name of this kind of transformation? "Unsubpivoting", maybe? :sweat_smile:
Any help is appreciated!

I think you want flatten.

flatten(df, :c)
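
For anyone landing here later, a minimal end-to-end sketch using the example from the question. It checks that flatten gives the goal DataFrame, and it also builds the same result by hand to show what the operation does. The names flat, idx and manual are just illustrative, and the manual version is only there for explanation, not a suggestion that it is faster.

```julia
using DataFrames

df = DataFrame(a=[1, 2], b=["yes", "no"], c=[["a", "b", "c"], ["d"]])
goal = DataFrame(a=[1, 1, 1, 2], b=["yes", "yes", "yes", "no"], c=["a", "b", "c", "d"])

# flatten repeats the values of the other columns once per element of :c
flat = flatten(df, :c)
@assert flat == goal

# The same transformation written out by hand (for illustration only):
# repeat each row index once per element of its vector in :c,
# then concatenate all the vectors into a single column.
idx = [i for (i, v) in enumerate(df.c) for _ in 1:length(v)]
manual = DataFrame(a=df.a[idx], b=df.b[idx], c=reduce(vcat, df.c))
@assert manual == flat
```

Note that flatten returns a new DataFrame with one row per element, so with millions of rows and around a hundred elements each, the result will likely have hundreds of millions of rows. It is probably worth checking that this fits in memory before running it on the full data.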

Perfect, thank you! Loving Julia with DataFrames a bit more every day :slight_smile: