DataFrame: Length of each value in a column


#1

Hello,

Suppose I have a DataFrame with, for example, 91 rows and 3 columns, and I would like to add a 4th column that holds the length of the value found in column 3 on each row. How can I apply this to each row, please?

The following gives me 91 in each row, which isn’t what I’m looking for.

df[:colName] = length(df.Column3)

In Python I would use the following, which measures the length of the Column3 value on each row and writes the number into a new column.

df['Length'] = df.Column3.apply(len)

Many thanks.
P.S. I have tried Googling and searching the forum, but I cannot work out how to apply the answers I found; they have something to do with map.


#2

Ah, sorry. I just figured out the solution thanks to this great post:

DataFrames.jl - Vectorized row-wise function application

I ended up defining a function:

f(a) = length(a)

Then I called it with a dot, so that it is applied to each element, to fill the new column:

df[:ColName] = f.(df.Column3)

I am not sure if this is the best way. I would appreciate knowing whether there are other ways.
Thanks
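
A note for readers on a recent DataFrames.jl release: the `df[:ColName]` indexing used above was removed in later versions, where a column is assigned with `df.ColName = ...` or `df[!, :ColName] = ...`. The following is only a sketch of the same idea in that syntax. It assumes DataFrames.jl 1.x is installed, and the sample data and the column names `Length` and `Length2` are made up for illustration.

```julia
using DataFrames  # assumption: DataFrames.jl 1.x is installed

# Hypothetical sample data standing in for the 91-row table
df = DataFrame(Column1 = 1:3,
               Column2 = [1.0, 2.0, 3.0],
               Column3 = ["a", "bcd", "ef"])

# Same helper as in the post above
f(a) = length(a)

# The dot after f applies it to each element of Column3
df.Length = f.(df.Column3)

# Equivalent spelling with explicit column indexing
df[!, :Length2] = f.(df.Column3)
```

Both assignments store one length per row rather than the number of rows.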


#3

Why not broadcast length directly?

df[:col_3_length] = length.(df.col_3)
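
To make the difference concrete, here is a small sketch using a plain vector in place of the DataFrame column (the variable names and sample strings are made up for illustration). Without the dot, `length` measures the vector itself, which would explain the row count appearing in every row in the first attempt; with the dot, it is applied to each element. `map` and a comprehension are shown as alternative ways to get the same result.

```julia
# Hypothetical stand-in for df.col_3: a plain vector of strings
col_3 = ["apple", "fig", "banana"]

# Without the dot: the length of the vector itself (number of rows)
n_rows = length(col_3)

# With the dot: length applied to each element (broadcasting)
lengths = length.(col_3)

# Two other ways to apply a function element by element
via_map = map(length, col_3)
via_comprehension = [length(s) for s in col_3]
```

Here `n_rows` is a single number, while `lengths`, `via_map`, and `via_comprehension` each hold one length per element.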

#4

That's great. I was not yet aware of this syntax in Julia (I am just learning it from a book at the moment).

Thank you.


#5

I would recommend working through the manual first; it is very nicely written.