Csv error reading numbers as string

mayar · December 6, 2020, 12:16pm

Hello all,

I have always used CSV.read for my CSV files, but since the latest Julia update I have trouble with it.
It sometimes reads numbers as integers and sometimes does not.
How do I avoid this problem?

thanks

martincornejo · December 6, 2020, 1:04pm

You can specify types for the columns

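using CSV, DataFrames

# Map column names to the element types they should be parsed as
types = Dict(
    :columnA => String,
    :columnB => Float64
)

# `; types` is shorthand for `; types = types` (Julia 1.5 and later)
data = DataFrame(CSV.File("data.csv"; types))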

mayar · December 6, 2020, 2:17pm

It worked one time, then this is the error I got.

nilshg · December 6, 2020, 2:20pm

Can you post an MWE of your error? The error message seems to suggest that you passed a keyword argument type_buses which doesn’t exist.

mayar · December 6, 2020, 2:25pm

type_buses = Dict(
    :bus => Int16,
    :type => String,
    :bh => Float64,
    :v => Float64,
    :delta => Float64,
    :pg => Float64,
    :qg => Float64,
    :pd => Float64,
    :qd => Float64,
    :pgmax => Float64,
    :pgmin => Float64,
    :qgmax => Float64,
    :qgmin => Float64,
)

mayar · December 6, 2020, 2:26pm

This is how I defined type_buses in my code, and I also added it to the CSV.File call:

BUSES_DATAFRAME = DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; type_buses))

nilshg · December 6, 2020, 2:38pm

types is a keyword argument, not a positional argument, so you need to name it. Here’s a full MWE:

julia> using CSV, DataFrames

julia> df = DataFrame(rand(2, 3), :auto);

julia> CSV.write("out.csv", df);

julia> type_buses = Dict(:x1 => Float32, :x2 => String, :x3 => Float64);

julia> CSV.read("out.csv", DataFrame; types = type_buses)
2×3 DataFrame
 Row │ x1        x2                    x3       
     │ Float32   String                Float64  
─────┼──────────────────────────────────────────
   1 │ 0.539608  0.050591523119873916  0.750995
   2 │ 0.396338  0.48258399391743123   0.463351

Note that I write types = type_buses to specify the kwarg. In martincornejo's example above, things worked because the dictionary had the same name as the kwarg:

julia> types = type_buses;

julia> CSV.read("out.csv", DataFrame; types)
2×3 DataFrame
 Row │ x1        x2                    x3       
     │ Float32   String                Float64  
─────┼──────────────────────────────────────────
   1 │ 0.539608  0.050591523119873916  0.750995
   2 │ 0.396338  0.48258399391743123   0.463351

This fails if the name of the dict doesn’t match the kwarg name:

julia> CSV.read("out.csv", DataFrame; type_buses)
ERROR: MethodError: no method matching CSV.File(::CSV.Header{false, Parsers.Options{false, true, true, false, Missing, UInt8, Nothing}, Vector{UInt8}}; debug=false, typemap=Dict{Type, Type}(), type_buses=Dict{Symbol, DataType}(:x2 => String, :x3 => Float64, :x1 => Float32))
Closest candidates are:
  CSV.File(::CSV.Header; finalizebuffer, startingbyteposition, endingbyteposition, limit, threaded, typemap, tasks, lines_to_check, maxwarnings, debug) at /home/nils/.julia/packages/CSV/la2cd/src/file.jl:221 got unsupported keyword argument "type_buses"

Which is the error you’re seeing.

I think what you’re seeing here is a side effect of the implicit keyword argument names introduced in Julia 1.5, where f(; x) is shorthand for f(; x = x), so the shorthand might not even work on older versions.

In any case I’d say it’s always best to be explicit with kwargs and spell them out - after all, that’s what they are for.
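To make the shorthand concrete without needing a CSV file, here is a minimal sketch in plain Julia. The function read_table is a hypothetical stand-in that only accepts a types keyword; it is not part of CSV.jl, and the file name is just a placeholder string.

```julia
# Hypothetical stand-in for a reader that accepts only a `types` keyword.
# It does not touch the disk; it just returns what it was given.
read_table(path; types = nothing) = (path = path, types = types)

type_buses = Dict(:bus => Int16, :v => Float64)

# Explicit form: name the keyword, then give the value.
explicit = read_table("B14.csv"; types = type_buses)

# Shorthand form (Julia 1.5 and later): `; types` expands to `; types = types`,
# so it only works when the variable is literally called `types`.
types = type_buses
shorthand = read_table("B14.csv"; types)

# `; type_buses` expands to `; type_buses = type_buses`, a keyword the
# function does not know, so the call throws a MethodError.
failed = try
    read_table("B14.csv"; type_buses)
    false
catch err
    err isa MethodError
end
```

Both explicit and shorthand end up carrying the same dictionary, while the third call fails in the same way as the CSV.File call earlier in the thread.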

nilshg · December 6, 2020, 2:41pm

And more to your original question: I haven’t seen any regressions in CSV.jl’s capability of detecting number types automatically in recent releases, and if there are any, that might well be a bug.

Can you share the file for which an older version of CSV.jl successfully detected the correct type, but the latest version fails to do so?

mayar · December 6, 2020, 2:55pm

I am now more confused. Excuse me, can you make it simpler?

nilshg · December 6, 2020, 2:58pm

Sorry, that wasn’t the intention of course - the above is a self-contained example, so you can execute it line by line, and if there’s something you don’t understand feel free to check back in.

Alternatively, just change the function call you posted above:

DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; type_buses))

to

DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; types = type_buses))

and things should work.

I should also note that CSV.read(file, DataFrame) is the same as DataFrame(CSV.File(file)), just in case that added to your confusion!

mayar · December 6, 2020, 3:03pm

Yes it worked…Thanks a ton…I have one problem remaining…Line 15 which is the last line in the csv file is not read properly. I find all values corresponding to that line as missing in the DF

nilshg · December 6, 2020, 3:06pm

This is very hard to debug without the file itself, but I would assume that you are forcing types by passing types = type_buses, and the last line can’t be read in the format you are specifying (which is probably also the reason why CSV.read didn’t infer the types originally).

What happens if you just do:

CSV.read("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv", DataFrame)
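If a column does come back as strings, one way to hunt for the offending cell is to test every field with tryparse. The following is only a rough sketch in Base Julia: the sample text is made up (it is not the B14.csv file from this thread), and the naive split on commas assumes there are no quoted fields.

```julia
# Made-up sample standing in for a real file; the last row has a stray character.
sample = """
bus,v,delta
1,1.06,0.0
2,1.045,-4.98
3,1.01x,-12.72
"""

rows = split(strip(sample), '\n')
header = split(rows[1], ',')

# Collect (line number, column name, raw text) for every data field
# that cannot be parsed as a Float64.
bad_cells = Tuple{Int,String,String}[]
for (i, row) in enumerate(rows)
    i == 1 && continue  # skip the header line
    for (j, field) in enumerate(split(row, ','))
        if tryparse(Float64, field) === nothing
            push!(bad_cells, (i, String(header[j]), String(field)))
        end
    end
end

println(bad_cells)
```

Any entry in bad_cells points at a line and column worth inspecting by hand before forcing types on the whole file.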

mayar · December 6, 2020, 3:08pm

When I do this, I find that all numbers are read as strings.

mayar · December 6, 2020, 3:09pm

My file has a header and numbers, nothing else. Can I share part of it here?

nilshg · December 6, 2020, 3:13pm

From this it seems there is no line 15? You have a header, and 14 rows of data, why are you expecting 15 rows in your DataFrame?

mayar · December 6, 2020, 3:16pm

Yes, you are right. It seems I became short-sighted after spending hours in front of the computer trying to solve this issue. Now it's solved. Thanks a million!
