Specifically, I have a DataFrame in which one column has data type Int64 and contains only the values 0 and 1 (meaning, obviously, false and true). What is the easiest and quickest way to convert the column, including its entries, to the Boolean data type?
I thought about some kind of loop, but are there existing functions that I should know about?
More generally, please explain how to best tackle this (especially conceptually), if using a self-made solution.
My thought would be to take the whole array/column, check every value, build a new array based on set conditions (if 0, make false; if 1, make true, etc.), and then either replace the original column or add the new array to the DataFrame.
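For reference, here is a minimal sketch of both approaches, the built-in broadcasting route and the hand-written loop described above. It works on a plain vector, since a DataFrame column is just a vector; the names `flags` and `to_bool` are made up for illustration.

```julia
# Stand-in for the Int64 column; it holds only 0 and 1.
flags = [0, 1, 1, 0]

# Option 1: broadcast the Bool constructor over the column.
# This throws an InexactError if any value is something other than 0 or 1.
as_bool = Bool.(flags)

# Option 2: broadcast a comparison. Anything that is not 1 becomes false.
as_bool_cmp = flags .== 1

# Option 3: the hand-written loop (hypothetical helper name).
# Like option 2, it maps every value other than 1 to false.
function to_bool(col)
    out = Vector{Bool}(undef, length(col))
    for (i, x) in enumerate(col)
        out[i] = x == 1
    end
    return out
end

as_bool_loop = to_bool(flags)

# With a DataFrame `df` and a column named `flag` (both assumed here),
# the column could then be replaced with:
#   df.flag = Bool.(df.flag)
```

All three give the same result for 0/1 data. The broadcast versions are usually the ones to reach for, and `Bool.(...)` has the advantage of failing loudly if the column contains an unexpected value.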
I suppose I can post this here, since it concerns a similar issue.
There is a dataframe. It has a String column with missing values. Its values are actually integers.
What is the most direct and easiest way to convert this whole column of String (with missing) to one of Int64 (with missing)?
I thought of your generic way, @tbeason, but it seems it requires more in this case. I thought there was a function to convert such strings to int, but I could be mistaken.
Could you copy and paste part of your column into this thread? You probably want tryparse, which will return nothing if it encounters a value like "Id209.4", which can’t be parsed as a float.
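As a rough sketch of how that could look for a String column with missing values: the data in `raw` is invented, and `to_int` is a hypothetical helper name. It keeps missing entries as missing and also turns unparseable strings into missing rather than nothing, which is one possible choice and not the only one.

```julia
# Stand-in for the String column with missing values.
raw = ["1", missing, "42", "Id209.4"]

# Keep missing as missing; otherwise try to parse as Int.
# tryparse returns nothing on failure, and something(...) swaps that for missing.
to_int(x) = ismissing(x) ? missing : something(tryparse(Int, x), missing)

# Broadcast the helper over the column.
parsed = to_int.(raw)
```

If unparseable values should raise an error instead of silently becoming missing, `parse(Int, x)` can be used in place of the `tryparse` call.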
I tried df[:2,:Id_internal ] = tryparse.(Int64,df[:2,:Id_internal ])
and df[:2,:Id_internal ] = parse.(Int64,df[:2,:Id_internal ])
both gave me ERROR: setindex! not defined for WeakRefStrings.StringArray{String,1}
I read my data as follows: df_all = CSV.File("file.csv", delim = '\t') |> DataFrame
I then create a df with what I need: df = df_all[[:Id_internal, :Date]]
That is indeed a very odd error message. To be honest, I don’t know exactly why you are getting it. But note that you should be writing df[:, :Id_internal], not df[:2, :Id_internal]
cc @quinnj for why the user might have gotten such an odd error. I can’t replicate it.
This is old, deprecated syntax. It’s a concern that people are still finding this syntax in tutorials. Can you please post a link to the guide you are using to learn DataFrames?
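For comparison, here is a small sketch of the same steps in the indexing style used by current DataFrames.jl releases. It assumes a recent DataFrames.jl version, and the data is made up to stand in for the CSV file. The conversion replaces the column with a new vector instead of writing into the existing one, so it does not depend on the original column being mutable.

```julia
using DataFrames

# Made-up data standing in for the contents of the CSV file.
df_all = DataFrame(Id_internal = ["1", "2", "3"],
                   Date = ["2019-01-01", "2019-01-02", "2019-01-03"],
                   Other = [1.0, 2.0, 3.0])

# Select columns with an explicit row selector; `:` copies the selected columns.
df = df_all[:, [:Id_internal, :Date]]

# Replace the whole column with the parsed values.
# df[!, :Id_internal] = parse.(Int, df.Id_internal) is an equivalent spelling.
df.Id_internal = parse.(Int, df.Id_internal)
```

Note that in current versions `df[:, :Id_internal] = ...` writes into the existing column in place, so it cannot change the element type from String to Int; `df.Id_internal = ...` or `df[!, :Id_internal] = ...` replaces the column instead.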