Renaming multiple columns in Julia DataFrames

I have the following Julia DataFrame:

data_test = DataFrame(Time = 1:2, X1 = [0,1], X2 = [10,5])

And I have a list of names as follows:

technology = ["oil", "gas"]

How do I rename columns X1 and X2 using the list (excluding the Time column)? I can do it manually, but that is not practical for renaming hundreds of columns. Essentially, what I’m looking for is a way to map the list of names to the columns. Any efficient solution is highly appreciated.

Thanks

One possible method is to use the high-level transforms of TableTransforms.jl. For example, you can Select the columns of interest with a regular expression, and then Rename them:

pipeline = Select(r"X*") → Rename("X1" => "oil", "X2" => "gas", ...)

newdf = df |> pipeline

The transforms are implemented in terms of the Tables.jl API; we haven’t benchmarked them against a DataFrames.jl-specific solution yet. Others may be able to suggest alternative solutions using different packages.


Generalizing for an arbitrary number of columns using the built-in rename function from DataFrames.jl:

julia> rename!(data_test, ["X$i" => tech for (i, tech) in enumerate(technology)])
2×3 DataFrame
 Row │ Time   oil    gas
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      0     10
   2 │     2      1      5
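
If you would rather not build the "X$i" names inline, the same idea can be sketched with a non-mutating rename plus an explicit check that every expected column exists. This is only a sketch, assuming the columns really are named X1, X2, … in the same order as technology; the names old_names, missing_cols and renamed are made up for illustration.

```julia
using DataFrames

data_test = DataFrame(Time = 1:2, X1 = [0, 1], X2 = [10, 5])
technology = ["oil", "gas"]

# Assumption: column "X$i" corresponds to technology[i].
old_names = ["X$i" for i in eachindex(technology)]

# Fail early if any expected column is missing from the table.
missing_cols = setdiff(old_names, names(data_test))
isempty(missing_cols) || error("columns not found: $missing_cols")

# rename (without !) returns a renamed copy and leaves data_test untouched.
renamed = rename(data_test, old_names .=> technology)
```

Using rename instead of rename! keeps the original table around, which can be handy while you are still checking that the mapping is right.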

Thanks @stillyslalom …this is what I was looking for.

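One more alternative, using the @pipe macro from Pipe.jl. Only the columns matching X are paired with technology, so Time keeps its name:

@pipe replace(names(data_test), (names(data_test, r"^X") .=> technology)...) |>
      rename!(data_test, _)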

This approach comes in handy if you need a regular expression, applied to every name by broadcasting replace. For example, to change the names from X1, X2, X3… to Y1, Y2, Y3…, you can do:

@pipe replace.(names(data_test), r"X" => "Y") |>
      rename!(data_test,_)
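
For the regex case, rename! also accepts a function that is applied to each column name, so the Pipe.jl dependency is not strictly needed. A minimal sketch, assuming a fresh copy of the example table; anchoring the pattern with ^ is my own choice, so that an X in the middle of a name would be left alone.

```julia
using DataFrames

data_test = DataFrame(Time = 1:2, X1 = [0, 1], X2 = [10, 5])

# The function receives each column name as a String and returns the new name.
# Names that do not start with "X" (here: Time) come back unchanged.
rename!(name -> replace(name, r"^X" => "Y"), data_test)
```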

Or simply use:

rename!(data_test, names(data_test, r"X") .=> technology)
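
Since the question is really "everything except Time", the selector could also be written with Not instead of a regex. The sketch below assumes that the remaining columns appear in the same order as technology, and adds a length check so that a mismatch fails with a readable message; the name targets is hypothetical.

```julia
using DataFrames

data_test = DataFrame(Time = 1:2, X1 = [0, 1], X2 = [10, 5])
technology = ["oil", "gas"]

# All column names except Time, in table order.
targets = names(data_test, Not(:Time))

# Guard against a mismatch between the columns and the list of new names.
length(targets) == length(technology) ||
    throw(DimensionMismatch("got $(length(targets)) columns but $(length(technology)) names"))

# Pair old and new names element-wise and rename in place.
rename!(data_test, targets .=> technology)
```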