Apply a function in-place to a DataFrame column and return a new dataframe with replaced column

Hi, is there a function to apply a function to each value in a specific column of a DataFrame, and then return a DataFrame with the new column replacing the old column, even if it has a different eltype? Something like

mutate(df, col, my_function_that_operates_on_each_val_of_col)

For example, starting with

DataFrame(a = 1:3, b = ["2019 January", "2020 December", "2014 July"])

and using a function I made that operates on a single string to go to

DataFrame(a = 1:3, b = [Date(2019, 1, 1), Date(2020, 12, 1), Date(2014, 7, 1)])

I know I could do it using my_func.(df[:, :b]) but that doesn’t fit in a pipeline where each step returns a dataframe. I’ve looked at mapcols and transform and they don’t seem to fit.

I’d also be fine with a function that operates on a column and just adds the new column to the dataframe so I can drop the old column manually, if it’s not possible to change the datatype of an existing column.


I don’t think that this is really an in-place operation as you are changing the type. Regardless,

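transform!(df, "b" => ByRow(my_func) => "b")

should do what you want. Note that transform! modifies df itself; use transform (without the !) if you want a new DataFrame returned and df left untouched, which fits a pipeline better.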
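Here is a minimal sketch with the example data from the question. The function parse_year_month is a hypothetical stand-in for your own function; I am assuming it can be written with a Dates format string, where U matches the full English month name.

```julia
using DataFrames, Dates

# Hypothetical stand-in for the asker's function: "2019 January" -> Date(2019, 1, 1).
# The day is not in the string, so it defaults to 1.
parse_year_month(s::AbstractString) = Date(s, dateformat"yyyy U")

df = DataFrame(a = 1:3, b = ["2019 January", "2020 December", "2014 July"])

# Returns a new DataFrame; column b is replaced and its eltype becomes Date.
# df itself is left unchanged.
df_new = transform(df, :b => ByRow(parse_year_month) => :b)

# The same step written as a pipeline stage.
df_piped = df |> d -> transform(d, :b => ByRow(parse_year_month) => :b)

# Mutating variant, applied to a copy so the original df stays intact here.
df_copy = copy(df)
transform!(df_copy, :b => ByRow(parse_year_month) => :b)
```

Giving the output the same name as the input column is what makes it replace the old column instead of adding a new one, and the new column is free to have a different eltype.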
