Use only the first x letters of strings inside a column of a DataFrame


#1

Hi everyone,

I have a DataFrame with a column that contains strings. I would like to truncate each of those strings to its first nine letters and save the resulting column of strings back into the DataFrame.

What is the best way to go about it?

Thanks a lot!


#2

df[:column_name] = [s[1:min(length(s), 9)] for s in df[:column_name]]


#3

You could try something along the lines of:
d = DataFrame(strings = ["asd", "jkl", "qwe", "iop", "bnm"])
d[:truncated] = map(s -> s[1:2], d[:strings])


#4

Slicing with s[1:n] could fail for non-ASCII strings, because string indices are byte offsets rather than character counts. In 0.7, you can use first(s, 9), and more generally you could grab the code from https://github.com/JuliaLang/julia/pull/23960 and do s[1:min(endof(s), nextind(s, 8))]
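
For reference, here is a minimal sketch of the character-safe approach, assuming Julia 1.x (where endof has been renamed to lastindex and nextind accepts a character count as a third argument). The helper names truncate_chars and truncate_manual are made up for illustration and are not part of any library; a plain vector stands in for the DataFrame column.

```julia
# Sketch only, assuming Julia 1.x. Helper names are hypothetical.

# Preferred: `first(s, n)` returns the first n characters of s,
# or all of s if it has fewer than n characters.
truncate_chars(s::AbstractString, n::Integer) = first(s, n)

# Manual equivalent: nextind(s, 0, n) is the byte index at which the
# n-th character starts; min(...) guards against strings shorter than n.
truncate_manual(s::AbstractString, n::Integer) =
    s[1:min(lastindex(s), nextind(s, 0, n))]

# Stand-in for a DataFrame column of strings.
strings = ["naïve café", "short", "é"^11]

truncated = [truncate_chars(s, 9) for s in strings]
truncated_manual = [truncate_manual(s, 9) for s in strings]
```

Both versions count characters rather than bytes, so multi-byte characters such as é are never split. The same comprehension could be assigned back to the column in place of the plain vector.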


#5

Since my next question is in the same spirit, I’ll post it here.

Let’s say I have a string from which I want to cut off x letters on the right. The part before it may vary in length, but the part on the right always has the same length. How would I go about it?