How to format the type of a column in Julia?

Hello,
I have a dataframe and I would like to set the type of each column. In my case, all columns are strings. How do I, for instance, convert a column to numeric?
I am looking for a Julia equivalent of R’s df$x <- as.numeric(as.character(df$x)). Here is what I tried:

julia> df[df.Bact .== "A. vinelandii", :]
18×3 DataFrame
 Row │ Bact           X                  Y                
     │ String         String             String           
─────┼────────────────────────────────────────────────────
   1 │ A. vinelandii  0.050561797752809  1314042485.29793
   2 │ A. vinelandii  0.252808988764045  1224431687.48933
   3 │ A. vinelandii  0.502808988764045  1038388615.87881
   4 │ A. vinelandii  0.932584269662921  657651643.874837
   5 │ A. vinelandii  1.19662921348315   403011067.968018
   6 │ A. vinelandii  1.89606741573034   147820324.502811
   7 │ A. vinelandii  2.21629213483146   100946201.037493
   8 │ A. vinelandii  2.82865168539326   80524951.2089802
   9 │ A. vinelandii  0.039325842696629  79770165.0793892
  10 │ A. vinelandii  0.235955056179775  99530230.5872667
  11 │ A. vinelandii  0.5                232302118.140757
  12 │ A. vinelandii  0.943820224719101  615695604.542686
  13 │ A. vinelandii  1.19662921348315   949522176.82357
  14 │ A. vinelandii  1.37921348314607   1236017272.81974
  15 │ A. vinelandii  1.69101123595506   1450621774.81555
  16 │ A. vinelandii  1.88202247191011   1471259098.32493
  17 │ A. vinelandii  2.1938202247191    1370926953.01064
  18 │ A. vinelandii  2.79213483146067   1464347573.09895

julia> df[df.Bact .== "A. vinelandii", "X"]
18-element Vector{String}:
 "0.050561797752809"
 "0.252808988764045"
 "0.502808988764045"
 "0.932584269662921"
 "1.19662921348315"
 "1.89606741573034"
 "2.21629213483146"
 "2.82865168539326"
 "0.039325842696629"
 "0.235955056179775"
 "0.5"
 "0.943820224719101"
 "1.19662921348315"
 "1.37921348314607"
 "1.69101123595506"
 "1.88202247191011"
 "2.1938202247191"
 "2.79213483146067"

julia> parse(Float64, df[df.Bact .== "A. vinelandii", "X"])
ERROR: MethodError: no method matching parse(::Type{Float64}, ::Vector{String})
Closest candidates are:
  parse(::Type{T}, ::AbstractString; kwargs...) where T<:Real at parse.jl:379
Stacktrace:
 [1] top-level scope
   @ none:1

Thank you

You can use a broadcasted version of parse, for example

df.X = parse.(Float64, df.X)
df.Y = parse.(Float64, df.Y)

which will apply x -> parse(Float64, x) to each element of the array. See the "Dot Syntax for Vectorizing Functions" section of the Julia manual for more background.
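To make the broadcasting step concrete, here is a minimal sketch that uses only Base Julia. The vectors xs and messy are made-up sample data, not the original data frame, and the tryparse part is an extra that only matters if some strings may not be valid numbers.

```julia
# Made-up sample data standing in for a String column
xs = ["0.050561797752809", "0.5", "2.79213483146067"]

# The dot broadcasts parse over xs, one element at a time
parsed = parse.(Float64, xs)

# Equivalent spellings without the dot syntax
parsed_map = map(x -> parse(Float64, x), xs)
parsed_comp = [parse(Float64, x) for x in xs]

# tryparse returns nothing instead of throwing on malformed input
messy = ["1.5", "n/a", "3"]
maybe = tryparse.(Float64, messy)

# Replace each nothing with missing (one possible convention for bad values)
cleaned = something.(maybe, missing)
```

If every string is known to be a valid number, the plain parse.(Float64, ...) call is enough; the tryparse variant is just one way to avoid an error on bad input.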

Another option would be to use the transform function, for example

df_parsed = transform(
    df,
    :X => ByRow(x -> parse(Float64, x)) => :X,
    :Y => ByRow(x -> parse(Float64, x)) => :Y,
)

Note that there is also a mutating version, transform!, which modifies df in place instead of returning a new data frame:

transform!(
    df,
    :X => ByRow(x -> parse(Float64, x)) => :X,
    :Y => ByRow(x -> parse(Float64, x)) => :Y,
)
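The difference between the two variants can be checked on a small example. This is a sketch that assumes DataFrames.jl is installed; the two-row data frame and the names type_before, type_after_copy and type_after_inplace are invented for illustration.

```julia
using DataFrames

# Small made-up data frame with String columns, mimicking the question
df = DataFrame(Bact = ["A. vinelandii", "A. vinelandii"],
               X = ["0.5", "1.25"],
               Y = ["10.0", "20.0"])

type_before = eltype(df.X)

# Non-mutating: returns a new data frame, df keeps its String columns
df_parsed = transform(
    df,
    :X => ByRow(x -> parse(Float64, x)) => :X,
    :Y => ByRow(x -> parse(Float64, x)) => :Y,
)
type_after_copy = eltype(df.X)

# Mutating: the columns of df itself are replaced
transform!(
    df,
    :X => ByRow(x -> parse(Float64, x)) => :X,
    :Y => ByRow(x -> parse(Float64, x)) => :Y,
)
type_after_inplace = eltype(df.X)
```

Here type_before and type_after_copy are both String, while eltype(df_parsed.X) and type_after_inplace are Float64.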

In DataFramesMeta you can do

@rtransform df :X = parse(Float64, :X)