CSV, white spaces and data type. Force Float on empty fields

Yes, when I removed the missingstring=" " option, tryparse worked. With missingstring on, it errors:

julia> tryparse.(Float64,isc[!, :smajax])
ERROR: MethodError: no method matching tryparse(::Type{Float64}, ::Missing)
Closest candidates are:
  tryparse(::Type{Float64}, ::String) at parse.jl:247
  tryparse(::Type{Float64}, ::SubString{String}) at parse.jl:252
  tryparse(::Type{T}, ::AbstractString) where T<:Union{Float32, Float64} at parse.jl:287

Yes, it’s the explanation I gave above. passmissing(tryparse(Float64, missing)) returns missing instead of nothing.

I think you mean passmissing(tryparse)(Float64, missing) - we’re not having a lot of luck with correctly typing this out today :smiley:
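To make the difference between missing and nothing concrete, here is a small sketch. It assumes the Missings.jl package is available (that is where passmissing comes from) and uses a made-up three-element vector as a stand-in for the smajax column:

```julia
using Missings  # provides passmissing

# Made-up stand-in for a column read with missingstring=" "
col = ["8.6", missing, "n/a"]

# passmissing(tryparse) returns missing for missing input
# and otherwise calls tryparse as usual
parsed = passmissing(tryparse).(Float64, col)

# tryparse signals a failed parse with nothing; map that to missing too
floats = something.(parsed, missing)
```

Here parsed still holds a nothing for the unparseable "n/a" entry, so the final something call is what leaves you with only Float64 and missing values.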

FWIW I downloaded your data and am getting all String columns (irrespective of any missing values - it seems to me that missingstrings = [" "^i for i in 1:8] catches them all), so not sure how you’re managing to get any Floats :man_shrugging:

There are a few other discussions I found on here around fixed width files which suggest CSV.jl should be able to handle them with ignorerepeated = true, but that doesn’t help in my tests either. Maybe there should be a trim_whitespace kwarg like in readr’s CSV parser.
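In the absence of such a kwarg, one workaround is to read the padded columns as Strings and convert them afterwards. The following is only a sketch using Base functions: to_float is a hypothetical helper name, and the raw vector is invented sample data, not taken from the actual file.

```julia
# Hypothetical helper: turn one raw field into a Float64 or missing.
to_float(x::Missing) = missing
to_float(x::Real) = Float64(x)
function to_float(x::AbstractString)
    s = strip(x)                  # drop the fixed-width padding
    isempty(s) && return missing  # blank field
    # unparseable text also becomes missing
    return something(tryparse(Float64, s), missing)
end

# Invented sample: padded numbers, a blank field, and a missing
raw = ["     8.6", "        ", "   164.2", missing]
vals = to_float.(raw)
```

Applied to the DataFrame from this thread, something like isc.smajax = to_float.(isc.smajax) should then give a column of Float64 and missing values, regardless of whether CSV.jl parsed it as String or Float64.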

Sorry had to leave for a Zoom.
I’m on Windows. I don’t see why we would have different column types with the same file and same commands. Ah, I’m also on “v1.6dev”

Edit: nope, tried with 1.5 and still get Floats and Strings

julia> isc = CSV.File("isc-gem-cat.csv", header=93, normalizenames=true, dateformat=Dates.DateFormat("yyyy-mm-dd HH:MM:SS.sss")) |> DataFrame
39160×31 DataFrame
   Row │ _date                   lat      lon       smajax    sminax    strike   q       depth    unc      q_1     mw       unc_1    q_2     s       mo       fac     mo ⋯
       │ DateTime                Float64  Float64   String    String    String   String  Float64  Float64  String  Float64  Float64  String  String  String   String  St ⋯
───────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
     1 │ 1904-04-04T10:02:34.56   41.802    23.108      8.6       6.6    164.2    B         15.0      4.8   C         6.84     0.4    C       d        2.30    19      b ⋯
     2 │ 1904-04-04T10:26:00.88   41.758    23.249      8.3       6.9     15.2    B         15.0      4.8   C         7.02     0.4    C       d        4.20    19      b
     3 │ 1904-06-25T14:45:39.14   51.424   161.638     33.6      18.7    116.2    C         15.0     25.0   C         7.5      0.4    C       d        2.60    20      b