Use only the first x letters of strings inside a column of a DataFrame

Hi everyone,

I have a DataFrame with a column that contains strings. I would like to take only the first nine letters of each string and save the resulting column of strings back into the DataFrame.

What is the best way to go about it?

Thanks a lot!

df[:column_name] = [s[1:min(length(s), 9)] for s in df[:column_name]]


You could try something along the lines of:
d = DataFrame(strings = ["asd", "jkl", "qwe", "iop", "bnm"])
d[:truncated] = map(s -> s[1:2], d[:strings])

Slicing by index like this can fail for non-ASCII strings, because string indices are byte offsets rather than character counts. In Julia 0.7, you can use first(s, 9). More generally, you could grab the code from JuliaLang/julia Pull Request #23960 ("first and last with nchar" by bkamins) and do s[1:min(endof(s), nextind(s, 8))]
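To make the non-ASCII point concrete, here is a minimal sketch, assuming Julia 1.x (where the error type is `StringIndexError`). The vector `column` and its values are made up for illustration and stand in for the DataFrame column; nothing here is taken from the original poster's data.

```julia
# Hypothetical stand-in for a DataFrame column of strings.
# In the third entry, 'é' occupies bytes 8 and 9.
column = ["abcdefghijkl", "short", "abcdefgéhijk"]

# `first(s, n)` counts characters rather than bytes, and returns the whole
# string when it has fewer than n characters. The dot broadcasts it over
# every element of the vector.
truncated = first.(column, 9)

# Byte-range slicing throws when the end index falls inside a multi-byte
# character: byte 9 of the third string is the second byte of 'é'.
byte_slice_fails = try
    column[3][1:9]
    false
catch err
    err isa StringIndexError
end
```

With a recent DataFrames.jl, the same broadcast should apply to a real column, along the lines of `df[!, :column_name] = first.(df[!, :column_name], 9)`; that indexing syntax is newer than the `df[:column_name]` form used earlier in this thread.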


Since my next question is in the same spirit, I’ll post it here.

Let’s say I have a string from which I want to cut off x letters on the right. What comes before may vary in length, but the part on the right is always of the same length. How would I go about it?