Split Column in Dataframe

I am trying to figure out how to split a column of strings into a set number of columns. I have the example below. I want to split the Name column of the Titanic dataset into four columns. I know I can create one column called LastName with df.LastName = first.(split.(df.Name, " ")), and I can get the last value the same way by replacing first with last. However, I cannot figure out how to get the rest of the values. In the code below, performing the split creates a vector of substrings, and I cannot figure out how to index or iterate through it to get the values I want and split them into several columns.
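
For reference, here is a minimal sketch of indexing into the result of a broadcasted split using only Base Julia. The names in full_names are made up for illustration and do not follow the actual Titanic format; the sketch also assumes every string splits into at least two pieces.

```julia
# Hypothetical sample data, not the real Titanic Name column.
full_names = ["John Quincy Adams", "Mary Ann Smith"]

# Broadcasting split gives one vector of substrings per row.
parts = split.(full_names, " ")

# first and last pick the ends of each row's pieces;
# getindex broadcasts an arbitrary position over all rows.
first_names  = first.(parts)
middle_names = getindex.(parts, 2)
last_names   = last.(parts)

# The same thing written as a comprehension.
middle_again = [p[2] for p in parts]
```

Each of these is an ordinary vector, so it can be assigned to a new column in the same way as df.LastName above.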

Do all the names split into the same number of pieces? If so, you can do

transform(df, :Name => ByRow(split) => AsTable)

then rename the new columns afterwards.

In DataFramesMeta this is

@rtransform df $AsTable = split(:Name)

EDIT: Even better, name the new columns directly:

transform(df, :Name => ByRow(split) => [:first, :middle, :last])

or

@rtransform df $[:first, :middle, :last] = split(:Name)
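
As a rough sketch of how this plays out end to end, the snippet below builds a tiny, made-up DataFrame (the names are hypothetical and every one has exactly three pieces), checks that assumption, and then applies the transform. It assumes DataFrames.jl 1.0 or later.

```julia
using DataFrames

# Hypothetical data: each name splits into exactly three pieces.
df = DataFrame(Name = ["John Quincy Adams", "Mary Ann Smith"])

# Guard the assumption that every row yields three pieces;
# the transform below needs one piece per target column.
@assert all(==(3), length.(split.(df.Name)))

# Split each row and spread the pieces over three named columns.
out = transform(df, :Name => ByRow(split) => [:first, :middle, :last])
```

The original Name column is kept, and out gains the columns first, middle and last. If some rows split into a different number of pieces, the length check fails first, which makes the problem easier to spot than an error from inside transform.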

Thank you, pdeffebach, this was the fix. I also wanted to mention something and ask a question. I see that you are one of the major contributors to this package, and I must say it is great: it functions a lot like R, and the syntax and paradigm are quite easy to understand. I am new to programming and to Julia, and I really like both its current state and its potential. Is there a way for me to contribute, to the DataFramesMeta package specifically and to Julia as a whole, even as a very novice programmer?

Those are very kind words!

I think the best thing would be filing issues when you have problems or have ideas for new features.

Additionally, if you have a tutorial in another language that you really like, porting it to DataFramesMeta and having it be part of the docs would be very cool.

Okay, I will do that, thanks for the help.