I want to “flatten” a table with multiple cells that contain vectors (these have equal lengths per table-row):

```
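using IndexedTables  # or `using JuliaDB`; provides `table`, `groupby`, `flatten`

t = table((group = [1,1,2,2], x = [1,2,3,4]))
# Each group returns two vectors of equal length, so each cell of t2 holds a vector
t2 = groupby(t, :group, usekey = true) do k, r
    (x2 = r.x .+ k.group, y2 = rand(length(r)))
end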
```

If I flatten it on, say, column `x2`, column `y2` doesn’t get “flattened”:

```
julia> flatten(t2, :x2)
Table with 4 rows, 3 columns:
group x2 y2
─────────────────────────────────
1 2 [0.154278, 0.31287]
1 3 [0.154278, 0.31287]
2 5 [0.0977201, 0.0372307]
2 6 [0.0977201, 0.0372307]
```

What I need is to “ungroup” table `t2` by `group`…
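To spell out the result I am after, here is a minimal sketch in plain Julia with no table package involved. The `grouped` data and its numbers are made up for illustration; the point is only that `x2` and `y2` are expanded in lockstep, one output row per vector element:

```julia
# Hypothetical stand-in for the grouped table: one named tuple per group,
# with equal-length vectors in x2 and y2 (values are illustrative only).
grouped = [
    (group = 1, x2 = [2, 3], y2 = [0.15, 0.31]),
    (group = 2, x2 = [5, 6], y2 = [0.10, 0.04]),
]

# Walk each group and pair up the i-th elements of x2 and y2.
flat = [(group = g.group, x2 = a, y2 = b)
        for g in grouped for (a, b) in zip(g.x2, g.y2)]

foreach(println, flat)
```

This yields four rows, each with scalar `x2` and `y2`, which is the shape the “ungrouped” table should have.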

OK, this seems to work:

```
julia> merge(table.(rows(select(t2, Not(:group))))...)
Table with 4 rows, 2 columns:
x2 y2
─────────────
2 0.154278
3 0.31287
5 0.0977201
6 0.0372307
```

But alas, this works only when there are exactly two groups, because `merge` takes just two tables.

OK, with `foldl` this actually works for any number of groups:

```
julia> t = table((group = [1,1,2,2,3,3], x = [1,2,3,4,5,6]))
Table with 6 rows, 2 columns:
group x
────────
1 1
1 2
2 3
2 4
3 5
3 6
julia> t2 = groupby(t, :group, usekey = true) do k, r
           (x2 = r.x .+ k.group, y2 = rand(length(r)))
       end
Table with 3 rows, 3 columns:
group x2 y2
────────────────────────────────────
1 [2, 3] [0.0581229, 0.019784]
2 [5, 6] [0.63852, 0.855233]
3 [8, 9] [0.023636, 0.783005]
julia> foldl(merge, table.(rows(select(t2, Not(:group)))))
Table with 6 rows, 2 columns:
x2 y2
─────────────
2 0.0581229
3 0.019784
5 0.63852
6 0.855233
8 0.023636
9 0.783005
```
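Why `foldl` helps: it applies a two-argument function from left to right across a collection, so a function that only combines two things at a time can still be used on any number of them. A small sketch with plain vectors, where `vcat` merely stands in for `merge` on tables (an analogy, not the IndexedTables call itself):

```julia
# Three per-group chunks, mirroring the x2 column above
chunks = [[2, 3], [5, 6], [8, 9]]

# foldl(f, [a, b, c]) evaluates f(f(a, b), c)
combined = foldl(vcat, chunks)

# The same thing written out by hand
by_hand = vcat(vcat(chunks[1], chunks[2]), chunks[3])

println(combined)
println(combined == by_hand)
```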

Or let your `groupby` produce only a single column of `Tuple`s and flatten it directly:

```
t = table((group = [1,1,2,2], x = [1,2,3,4]))
t2 = groupby(t, :group, usekey = true, flatten = true) do k, r
    (z = [(r.x[i] + k.group, rand()) for i in 1:length(r)])
end
```

yields:

```
Table with 4 rows, 3 columns:
1 2 3
──────────────
1 2 0.798469
1 3 0.301879
2 5 0.883682
2 6 0.965717
```

piever
May 31, 2019, 5:29pm
#5
As proposed above, you can return an object that iterates `NamedTuple`s. The easiest is to use `Columns`, so this:

```
t = table((group = [1,1,2,2], x = [1,2,3,4]))
t2 = groupby(t, :group, usekey = true) do k, r
    (x2 = r.x .+ k.group, y2 = rand(length(r)))
end
```

would be replaced by:

```
t = table((group = [1,1,2,2], x = [1,2,3,4]))
t2 = groupby(t, :group, usekey = true, flatten = true) do k, r
    Columns(x2 = r.x .+ k.group, y2 = rand(length(r)))
end
```

That is the correct solution (for me). Awesome, thanks @piever!

BTW, why not `rows`? That iterates named tuples as well, no?

piever
May 31, 2019, 5:39pm
#8
They return the same object: `rows((x=col1, y=col2))` would also work. Either case actually returns a `StructVector` (a vector of structs stored as a struct of vectors). When porting IndexedTables to StructArrays, we kept the `Columns` name for backward compatibility (it is now an alias for `StructVector`). OTOH it is a bit confusing that `Columns(t::NamedTuple)` is the same as `rows(t::NamedTuple)`, not sure what’s the correct way forward in terms of API.
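To make “a vector of structs stored as a struct of vectors” concrete, here is a rough sketch in plain Julia. It does not use StructArrays; `cols` and `row_at` are hypothetical names that only imitate the idea of keeping one vector per field and building each row on demand:

```julia
# Storage: one vector per field (the "struct of vectors")
cols = (x = [1, 2, 3], y = [0.1, 0.2, 0.3])

# Row i is assembled by taking the i-th element of every column.
# `map` over a NamedTuple returns a NamedTuple with the same field names.
row_at(i) = map(c -> c[i], cols)

# Iterating the rows gives named tuples (the "vector of structs" view)
all_rows = [row_at(i) for i in eachindex(cols.x)]

println(row_at(2))
println(length(all_rows))
```

An object that behaves like `all_rows` when iterated, while keeping the data laid out like `cols`, is what the `groupby` call needs in order to flatten every column together.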
