Pull DataFrames columns to the front

I often need to pull certain DataFrames columns to the front, and I do this via select with Not. I wonder if there is a more readable way to do it.

julia> using DataFrames

julia> df = DataFrame(D=[1], E=[2], C=[3], A=[4], B=[5])
1×5 DataFrame
 Row │ D      E      C      A      B
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     1      2      3      4      5

julia> cols = [:A, :B];

julia> select(df, cols, Not(cols))
1×5 DataFrame
 Row │ A      B      D      E      C
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     4      5      1      2      3

https://stackoverflow.com/questions/47694704/how-do-i-change-the-order-of-columns-in-a-julia-dataframe

That one was answered by the author of DataFrames.jl, so yeah, what you’re doing is close to canonical already.

Unlike that question, I’m asking it more in general. What if you want cols::Vector to be placed at the front? Working with indexes is not really convenient. Even worse, what if you want the cols to be placed at position 2?

I ask because select(df, cols, Not(cols)) is not so readable, and I’ve learned that whenever I need to write a function for something that seems like basic DataFrames functionality, I’m probably doing something wrong.

I would have expected something like movecols!(df, 1, cols) similar to insertcols!.

I find select(df, cols, :) quite readable. Similarly, to insert them at position 2 you can use select(df, 1, cols, :). Or for position 3, select(df, 1:2, cols, :).
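
For reference, here is a small sketch of those three calls, reusing the df and cols from the question. The variable names front, second, and third are only illustrative; the column order noted in the comment is what I would expect for this particular example.

```julia
using DataFrames

df = DataFrame(D=[1], E=[2], C=[3], A=[4], B=[5])
cols = [:A, :B]

# `:` adds every column not already selected, so `cols` ends up first.
front = select(df, cols, :)

# Keep the first column in place, then `cols`, then the rest.
second = select(df, 1, cols, :)

# Keep the first two columns in place, then `cols`, then the rest.
third = select(df, 1:2, cols, :)

println(names(front))   # ["A", "B", "D", "E", "C"]
println(names(second))
println(names(third))
```

Each call returns a new DataFrame and leaves df untouched.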

Exactly as @sijo explains, there is no need to use Not. In this case, : is treated as every column except those already included up to this point.

The general design goal of DataFrames.jl is to minimize the number of verbs the user has to learn, so select and select! were judged to be enough, as they should be flexible enough in most cases.
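
If you want the reordering to happen in place, which is roughly what the proposed movecols! would do, a minimal sketch with select! might look like this. It reuses the example data from the question and assumes you are fine with df itself being mutated.

```julia
using DataFrames

df = DataFrame(D=[1], E=[2], C=[3], A=[4], B=[5])
cols = [:A, :B]

# select! reorders the columns of `df` itself instead of returning a new DataFrame.
select!(df, cols, :)

println(names(df))   # ["A", "B", "D", "E", "C"]
```

The same positional pattern should carry over, e.g. select!(df, 1, cols, :) to put cols at position 2.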

I’m amazed!

It never occurred to me that that was the case. Makes complete sense! Awesome design.

Now that I’ve seen select(df, 1, cols, :), I completely agree.

Thanks for the responses, everyone
