Suppose I have a DataFrame that corresponds to a multidimensional function. Something like this: df = DataFrame([(a = x, b = y, c = z, d = x+y+z) for x in 1:6 for y in 1:4 for z in 1:2])
What is the best way to transform it into the following multidimensional array? [x+y+z for x in 1:6, y in 1:4, z in 1:2]
Currently I’m doing it with groupby, but it gets complicated with higher dimensions (I’m a Matlab user, so I’m used to working with multidimensional arrays).
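Note that for x in 1:6 for y in 1:4 for z in 1:2 iterates the dimensions in exactly the opposite order from for x in 1:6, y in 1:4, z in 1:2: in the nested form the last variable (z) varies fastest, while in the comma form the first variable (x) varies fastest. That is why the permutedims is needed.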
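A minimal sketch of the reshape-plus-permutedims idea, assuming the table is a complete grid (every combination of a, b, c appears exactly once). The explicit sort and the names `sorted`, `A_zyx` and `A` are my own additions for illustration, not necessarily what the original snippet used:

```julia
using DataFrames

# Same table as in the question: c (z) varies fastest, then b (y), then a (x).
df = DataFrame([(a = x, b = y, c = z, d = x + y + z) for x in 1:6 for y in 1:4 for z in 1:2])

# Assumption: complete grid. Sorting makes the row order explicit (a slowest, c fastest).
sorted = sort(df, [:a, :b, :c])
nx, ny, nz = length(unique(df.a)), length(unique(df.b)), length(unique(df.c))

# Julia arrays are column-major, so the fastest-varying column must come first.
A_zyx = reshape(sorted.d, nz, ny, nx)

# Reverse the dimension order so that A[x, y, z] matches the target comprehension.
A = permutedims(A_zyx, (3, 2, 1))
```

If some combinations are missing from the table, the reshape will either fail or silently misalign values, so this only fits fully populated grids.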
Depending on how you obtain the data in the first place, you may be able to refactor that process to return a multidimensional array instead of a DataFrame. Arrays are indeed easy and convenient to use in Julia, and they are more general.
But for this particular operation, there’s a nice table → multidimensional array conversion function in AxisKeys.jl:
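A hedged sketch of that conversion, assuming the function meant here is `wrapdims` called on a Tables.jl-compatible table, with the value column given first and the dimension columns after it:

```julia
using DataFrames, AxisKeys

df = DataFrame([(a = x, b = y, c = z, d = x + y + z) for x in 1:6 for y in 1:4 for z in 1:2])

# Value column first, then the columns that become dimensions, in the desired order.
A = wrapdims(df, :d, :a, :b, :c)

# Round brackets look up by key rather than by index.
v = A(3, 2, 1)
```

The result is a KeyedArray whose dimensions are named after the columns, so the row order of the table does not matter here, unlike with a plain reshape.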
Can this be combined with DataFrames groupby or @by to wrap a subset of the dimensions, producing a DataFrame where one of the columns is a KeyedArray?
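One possible way to sketch this by hand, assuming `wrapdims` accepts each group once it is copied into its own DataFrame; the names `gdf`, `nested` and `arr` are hypothetical, and this is not a built-in nesting feature:

```julia
using DataFrames, AxisKeys

df = DataFrame([(a = x, b = y, c = z, d = x + y + z) for x in 1:6 for y in 1:4 for z in 1:2])

# Group on the dimension to keep as an ordinary column.
gdf = groupby(df, :a)

# One row per group: the group key, plus a KeyedArray over the remaining dimensions.
nested = DataFrame(
    a = [first(g.a) for g in gdf],
    arr = [wrapdims(DataFrame(g), :d, :b, :c) for g in gdf],
)
```

Each entry of `nested.arr` is then a 4×2 KeyedArray indexed by b and c.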
The DataFrames guys are working on nest and unnest, which would make this easier, but they aren’t released yet.