How to change the data type of a specific column of a dataframe?

Hi

I have a dataframe of 5 columns. All of them are of type “string”.
I want to convert column 2 (“count”) from String to Int16.

So far I have tried:

df_splunkie[:count]=Int16(df_splunkie[:count])

This is the error:

┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│   caller = top-level scope at In[94]:4
└ @ Core In[94]:4

MethodError: no method matching Int16(::Array{String,1})
Closest candidates are:
  Int16(!Matched::Union{Bool, Int32, Int64, UInt32, UInt64, UInt8, Int128, Int16, Int8, UInt128, UInt16}) at boot.jl:708
  Int16(!Matched::Float32) at float.jl:685
  Int16(!Matched::Float64) at float.jl:685
  ...

Stacktrace:
 [1] top-level scope at In[94]:4

Also this:

df_splunkie[!, 2]= Int16(df_splunkie[!, 2])

And this is the error:

MethodError: no method matching Int16(::Array{String,1})
Closest candidates are:
  Int16(!Matched::Union{Bool, Int32, Int64, UInt32, UInt64, UInt8, Int128, Int16, Int8, UInt128, UInt16}) at boot.jl:708
  Int16(!Matched::Float32) at float.jl:685
  Int16(!Matched::Float64) at float.jl:685
  ...

Stacktrace:
 [1] top-level scope at In[95]:4

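You are correct that you need to use df[!, :name] to access a particular column, not just df[:name]; that is what the deprecation warning is about. The MethodError is a separate problem: Int16(...) is being called on the whole array at once, and you cannot “directly” convert from string to integer anyway. You need to use parse, applied to each element with the broadcasting dot.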

df[!, 2] = parse.(Int16, df[!, 2])

I think that should do it?
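
Here is a small self-contained sketch of the same idea. The DataFrame below is a made-up stand-in for df_splunkie (the column names and values are assumptions for illustration only), and it assumes a reasonably recent DataFrames.jl:

```julia
using DataFrames

# Hypothetical stand-in for df_splunkie: every column starts out as String.
df = DataFrame(host = ["a", "b", "c"], count = ["10", "25", "7"])

# parse.(Int16, col) calls parse on each element and returns a new Vector{Int16};
# assigning it with `=` replaces the old String column.
df[!, :count] = parse.(Int16, df[!, :count])

# Check the element type of the converted column.
eltype(df.count)
```

The other columns are left untouched, so only “count” changes type.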

What does “parse” do? What is the difference from “convert”?

You should type ? parse to see the docs. The first line says “Parse a string as a number.”

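convert is for number-to-number type conversions (Float64 -> Int, etc.), and it only succeeds when the value can be represented exactly: convert(Int, 2.0) works, but convert(Int, 2.5) throws an InexactError. It can also be used to change the element type of an array.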
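
A quick way to see the difference at the REPL, using only Base (the variable names are just for illustration):

```julia
# parse reads the text of a string and builds a number from it.
n = parse(Int16, "42")

# convert changes the type of a value that is already a number.
m = convert(Int16, 42.0)

# convert can also change the element type of an array.
v = convert(Vector{Float64}, [1, 2, 3])

# Passing a string to convert fails with a MethodError,
# much like the Int16(...) call in the question:
# convert(Int16, "42")
```

Both n and m end up as Int16 values, but only parse accepts a string as input.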
