DataFrames: convert column data type

I looked around but didn’t find an answer.

Specifically, I have a DataFrame in which one column has data type Int64 and contains only the values 0 and 1 (meaning false and true). What is the easiest and quickest way to convert that column, including its entries, to the Boolean data type?

I thought about a kind of loop, but are there already existing functions available that I should know of?

More generally, please explain how to best tackle this (especially conceptually), if using a self-made solution.

My thought then might be to take the whole array/column, check every value, make a new array based on set conditions (if 0, make false; if 1, make true, etc.), mutate or add the new array into the dataframe.
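
The self-made approach described above could be sketched roughly as follows. This is only an illustration: the helper name int_to_bool is hypothetical, and a plain vector stands in for the DataFrame column so that nothing beyond Base is needed.

```julia
# Hypothetical helper: map 0 -> false and 1 -> true, rejecting anything else.
function int_to_bool(values::Vector{<:Integer})
    result = Vector{Bool}(undef, length(values))
    for (i, v) in enumerate(values)
        if v == 0
            result[i] = false
        elseif v == 1
            result[i] = true
        else
            throw(ArgumentError("unexpected value $v at position $i"))
        end
    end
    return result
end

col = [0, 1, 1, 0]        # stands in for the Int64 column
flags = int_to_bool(col)  # new Bool vector; the original is left untouched
```

With a DataFrame, the new vector would then be assigned back to the column, for example with df[!, :x] = int_to_bool(df[!, :x]).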

df.int_col .== 1 returns a BitVector of Bool values; assigning it back (df.int_col = df.int_col .== 1) replaces the column.

Very useful. Thanks.

A more general way to do this is (assuming the column is called x)

df[!,:x] = convert.(Bool,df[!,:x])
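
A small sketch of how the two suggestions so far differ, using a plain vector in place of the column (an assumption made to keep the example free of package dependencies). Both give Bool elements for 0/1 data, but convert is strict about other values while the comparison is not.

```julia
col = [0, 1, 1, 0]            # stands in for the Int64 column

a = convert.(Bool, col)       # element-wise conversion
b = col .== 1                 # element-wise comparison

same = (a == b)               # identical results for 0/1 data

# convert refuses integers other than 0 and 1 ...
strict = try
    convert.(Bool, [0, 1, 2])
    false
catch err
    err isa InexactError
end

# ... whereas the comparison silently maps every non-1 value to false.
lenient = [0, 1, 2] .== 1
```

Which behaviour is preferable depends on whether unexpected values in the column should raise an error or be treated as false.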

Thanks for the reply.

I suppose I can post this here, since it concerns a similar issue.

There is a dataframe. It has a String column with missing values. Its values are actually integers.

What is the most direct and easiest way to convert this whole column of String (with missing) to one of Int64 (with missing)?

I thought of your generic way, @tbeason, but it seems it requires more in this case. I thought there was a function to convert such strings to int, but I could be mistaken.

I’m making some progress. I found the function parse().

Unfortunately parse doesn’t work with missings. You are looking for passmissing from Missings.jl.

df.col = passmissing(parse).(Int, df.col)
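
To show what passmissing(parse) does conceptually, here is a sketch of the same logic written with Base only. The helper name parse_or_missing is hypothetical, and a plain vector again stands in for the String column.

```julia
# Hypothetical helper: leave missing alone, parse everything else as Int.
parse_or_missing(s) = ismissing(s) ? missing : parse(Int, s)

col = ["1", missing, "3"]        # Vector{Union{Missing, String}}
parsed = parse_or_missing.(col)  # element type becomes Union{Missing, Int}
```

Like parse itself, this throws an ArgumentError for a string that is not a valid integer.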

Thanks.

I believe that you could just have

df[:,:x] = convert.(Bool,df[!,:x])

(notice that : instead of !) to avoid making two copies.

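My mistake, it has to be the other way around. Assigning with df[:, :x] = ... writes into the existing Int64 column in place, so the column type would not change, whereas df[!, :x] = ... replaces the column: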

df[!,:x] = convert.(Bool,df[:,:x])
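
The difference between ! and : can be checked with a small sketch like the one below. It assumes a recent DataFrames.jl (1.x) is installed; the data and the column name x are made up for illustration.

```julia
using DataFrames

df = DataFrame(x = [0, 1, 1])

# On the right-hand side: ! returns the stored vector, : returns a copy.
no_copy = df[!, :x] === df.x
is_copy = df[:, :x] !== df.x

# On the left-hand side: : stores the values into the existing Int64 column,
# so the Bool values are converted back and the column type is unchanged.
df[:, :x] = convert.(Bool, df[!, :x])
type_after_colon = eltype(df.x)

# ! replaces the column with the new vector, so the type becomes Bool.
df[!, :x] = convert.(Bool, df[!, :x])
type_after_bang = eltype(df.x)
```

Under these assumptions, using ! on both sides already changes the column type, and it avoids the extra copy that df[:, :x] makes on the right-hand side.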