I have a DataFrame and want to grab a subset of its rows: for each value of :B, I want the row where :C is smallest within that group.
I tried creating an empty DataFrame to push!() to, but you can’t do that without specifying the column types first. My MVE is simple here, but I couldn’t find any documentation on how to create a new DataFrame with the same structure as an existing one without copying it and deleting all the rows.
Note that the _ in the @map stands for each group. Each group here is a table, so first we extract the C column from that table with _.C, then find the index of the smallest element in that column with argmin, and finally index into the table with _[row_number] to extract the full row. So for each group @map returns one row, and those rows in turn constitute a new table, which we materialize into a DataFrame.
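For comparison, here is a rough sketch of the same three steps (group, find the argmin of C, index out that row) written with plain DataFrames.jl groupby/combine instead of the Query pipeline described above. The sample data and the names df and result are made up for illustration, and it assumes a reasonably recent DataFrames.jl (1.x).

```julia
using DataFrames

# Hypothetical sample data, not the data from the original question
df = DataFrame(A = 1:6,
               B = ["a", "a", "b", "b", "c", "c"],
               C = [3, 1, 2, 5, 4, 0])

# For each group of B: locate the smallest C, then return that whole row.
# sdf is the sub-data-frame for one group, playing the role of _ in @map.
result = combine(groupby(df, :B)) do sdf
    row_number = argmin(sdf.C)   # index of the smallest C within the group
    sdf[row_number, :]           # the full row at that index
end
```

Each group contributes exactly one row, so result has as many rows as there are distinct values of B.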
If we had this, the @map part could be written as @map(minimum(_, i->i.C)), which would be a bit more elegant.
You can use similar(df, 0) to create a data frame with the same columns as df. With this PR it will even be possible to push rows to an empty data frame and have new columns be created automatically.
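A minimal sketch of the similar(df, 0) approach, in case it helps. The data and the names df and empty_df are hypothetical, and only the pushing of rows that match the existing columns is shown, since the automatic column creation depends on the PR mentioned above.

```julia
using DataFrames

# Hypothetical source data frame
df = DataFrame(A = 1:3, B = ["x", "y", "z"])

# Zero rows, but the same column names and element types as df
empty_df = similar(df, 0)

# Rows can now be pushed without declaring the column types by hand
push!(empty_df, df[2, :])            # an existing row of df
push!(empty_df, (A = 10, B = "w"))   # a NamedTuple with matching columns
```

After these two calls empty_df holds two rows, and its columns still have the element types inherited from df.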
But in general the solutions proposed in other comments are probably better.