I have a dataset with a bunch of columns containing string data in the format “(1.0, 2.0)”. All of these columns end with the suffix “CI”. I’d like to apply a function that splits each string into its low and high values and parses them as floats. Then I’d like to replace the original columns with $(original_name)_low and $(original_name)_high.
I know how I would do this with the DataFrames.jl DSL, but I’d really like to achieve this with TidierData.jl because it makes my code more accessible for my R peers.
Any chance it’s possible without hard coding all the column names?
Here is one possible solution, where you turn the string into a tuple or an array and then use @unnest_wider. @kdpsingh may be able to offer others that are better too.
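For illustration, here is a minimal sketch of just the parsing step (string to tuple), using only Base Julia. The helper name `parse_ci` is hypothetical and not part of TidierData.jl, and the sketch assumes every value looks exactly like “(1.0, 2.0)”, with no missing entries.

```julia
# Hypothetical helper: turn "(1.0, 2.0)" into the tuple (1.0, 2.0).
# Assumes exactly two comma-separated numbers wrapped in parentheses.
function parse_ci(s::AbstractString)
    inner = strip(s, ['(', ')', ' '])        # drop the surrounding parentheses
    low, high = split(inner, ',')            # "1.0" and " 2.0"
    return (parse(Float64, strip(low)), parse(Float64, strip(high)))
end

parse_ci("(1.0, 2.0)")
```

The resulting tuple (or an array, if you prefer `collect`) is the kind of value that can then be spread into separate columns.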
Now that you have a solution with TidierData, I hope it’s not annoying to post a solution with (disclaimer: my package) DataFrameMacros. It’s a lot of fun to try and solve these data wrangling “code golfing” problems, and I was happy to find a one-liner for this:
The string.({}, ["_low", "_high"]) part expands to [["a_CI_low", "a_CI_high"], ["b_CI_low", "b_CI_high"]], so in the DataFrames minilanguage it’s like having [:a_CI, :b_CI] .=> the_function .=> [["a_CI_low", "a_CI_high"], ["b_CI_low", "b_CI_high"]].
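To make that expansion concrete, here is a rough sketch of the equivalent written directly in the DataFrames.jl minilanguage. This is not the DataFrameMacros one-liner itself. The toy data, the `id` column, and the helper name `parse_ci` are assumptions made up for the example.

```julia
using DataFrames

# Hypothetical helper: "(1.0, 2.0)" -> [1.0, 2.0]
parse_ci(s) = parse.(Float64, strip.(split(strip(s, ['(', ')']), ',')))

# Toy data, invented for this example
df = DataFrame(
    id   = 1:2,
    a_CI = ["(1.0, 2.0)", "(3.0, 4.0)"],
    b_CI = ["(0.5, 1.5)", "(2.5, 3.5)"],
)

# Pick the columns by suffix instead of hard coding their names
ci_cols = names(df, r"CI$")

# One vector of target names per source column:
# [["a_CI_low", "a_CI_high"], ["b_CI_low", "b_CI_high"]]
targets = [string.(c, ["_low", "_high"]) for c in ci_cols]

# Broadcast source .=> function .=> targets, then drop the original columns
result = select(
    transform(df, ci_cols .=> ByRow(parse_ci) .=> targets),
    Not(ci_cols),
)
```

Because each target is a vector of two names, the two-element vector returned by `parse_ci` for each row is spread across the `_low` and `_high` columns.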