If a dataframe is defined as:
df_dummy = DataFrame(A=[1,2,3],B=["A","B","C"],ID =[121,321,421])
How can I change the order of one or more columns and modify the dataframe in place? For example, I want to make ID the first column and B the last column.
julia> df_dummy = DataFrame(A=[1,2,3],B=["A","B","C"],ID =[121,321,421])
3×3 DataFrame
 Row │ A      B       ID
     │ Int64  String  Int64
─────┼──────────────────────
   1 │     1  A         121
   2 │     2  B         321
   3 │     3  C         421
julia> select!(df_dummy, :ID, Not([:ID, :B]), :B)
3×3 DataFrame
 Row │ ID     A      B
     │ Int64  Int64  String
─────┼──────────────────────
   1 │   121      1  A
   2 │   321      2  B
   3 │   421      3  C
Note that select!(df_dummy, :ID, Not(:B), :B) is a bit shorter and would also work, because the selectors are evaluated left to right and Not(:B) does not add :ID a second time once it has already been selected. But I think explicitly excluding :ID conveys the intention more clearly.
Similarly, if you know you have exactly 3 columns you can just write select!(df_dummy, :ID, :A, :B). The expression I have given above is general and allows any number of columns in the middle.
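To illustrate that the general form works with any number of columns in the middle, here is a minimal sketch that wraps it in a helper. The name reorder_ends! and the extra column C are hypothetical, added only for this example; the sketch assumes the two column names are different.

```julia
using DataFrames

# Hypothetical helper (not part of DataFrames.jl): move `first_col` to the
# front and `last_col` to the end, in place. Assumes first_col != last_col.
function reorder_ends!(df::DataFrame, first_col::Symbol, last_col::Symbol)
    # Not([...]) keeps every remaining column in its current relative order.
    return select!(df, first_col, Not([first_col, last_col]), last_col)
end

# Same data as in the question, plus an extra middle column C for illustration.
df = DataFrame(A=[1, 2, 3], B=["A", "B", "C"], C=[0.1, 0.2, 0.3], ID=[121, 321, 421])
reorder_ends!(df, :ID, :B)
names(df)  # ["ID", "A", "C", "B"]
```

Because select! mutates its argument, any other variable bound to the same data frame sees the new column order as well.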
Thank you. I use select a lot but never realised it can be used this way.
We try to minimize the number of functions that users need to learn, as they are already quite complex.
I don't know if this is exactly the column permutation you are looking for, but here is an equivalent way of achieving the same result:
select!(df_dummy, circshift(names(df_dummy), 1))
# or (this builds a new data frame and rebinds the name, rather than modifying in place)
df_dummy = df_dummy[:, circshift(names(df_dummy), 1)]
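The two circshift variants differ in whether the original object is modified, which a small sketch can make visible. The variable names alias and copied are only for illustration. Note also that shifting by 1 moves the last column to the front, so it gives the requested order here only because ID is last and B is second to last.

```julia
using DataFrames

df = DataFrame(A=[1, 2, 3], B=["A", "B", "C"], ID=[121, 321, 421])
alias = df  # a second reference to the same data frame

# Indexing with df[:, cols] returns a new data frame; df itself is untouched.
copied = df[:, circshift(names(df), 1)]
names(copied)  # ["ID", "A", "B"]
names(df)      # ["A", "B", "ID"]

# select! reorders the columns of the existing object, so alias sees the change.
select!(df, circshift(names(df), 1))
names(alias)   # ["ID", "A", "B"]
```

If other code holds a reference to the data frame, only the select! version changes what that code sees.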