Hello all,
I have always used CSV.read for my CSV files, but since the latest Julia update I have trouble with it.
It sometimes reads numbers as integers and sometimes does not.
How do I avoid this problem?
thanks
You can specify types for the columns
using CSV, DataFrames
types = Dict(
:columnA => String,
:columnB => Float64
)
data = DataFrame(CSV.File("data.csv"; types))
More info in the documentation:
https://csv.juliadata.org/stable/#Getting-Started
https://csv.juliadata.org/stable/#Providing-Types
It worked one time, then this is the error I got:
Can you post an MWE of your error? The error message seems to suggest that you passed a keyword argument type_buses, which doesn’t exist.
type_buses= Dict(
:bus => Int16,
:type => String,
:bh => Float64,
:v => Float64,
:delta => Float64,
:pg => Float64,
:qg => Float64,
:pd => Float64,
:qd => Float64,
:pgmax => Float64,
:pgmin => Float64,
:qgmax => Float64,
:qgmin => Float64,
)
This is how I defined type_buses in my code, and I also added it to the CSV.File call:
BUSES_DATAFRAME = DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; type_buses))
types is a keyword argument, not a positional argument, so you need to name it. Here’s a full MWE:
julia> using CSV, DataFrames
julia> df = DataFrame(rand(2, 3), :auto);
julia> CSV.write("out.csv", df);
julia> type_buses = Dict(:x1 => Float32, :x2 => String, :x3 => Float64);
julia> CSV.read("out.csv", DataFrame; types = type_buses)
2×3 DataFrame
Row │ x1 x2 x3
│ Float32 String Float64
─────┼──────────────────────────────────────────
1 │ 0.539608 0.050591523119873916 0.750995
2 │ 0.396338 0.48258399391743123 0.463351
Note that I write types = type_buses to specify the kwarg. In the first example in this thread (CSV.File("data.csv"; types)), things worked because the dictionary had the same name as the kwarg:
julia> types = type_buses;
julia> CSV.read("out.csv", DataFrame; types)
2×3 DataFrame
Row │ x1 x2 x3
│ Float32 String Float64
─────┼──────────────────────────────────────────
1 │ 0.539608 0.050591523119873916 0.750995
2 │ 0.396338 0.48258399391743123 0.463351
This fails if the name of the dict doesn’t match the kwarg name:
julia> CSV.read("out.csv", DataFrame; type_buses)
ERROR: MethodError: no method matching CSV.File(::CSV.Header{false, Parsers.Options{false, true, true, false, Missing, UInt8, Nothing}, Vector{UInt8}}; debug=false, typemap=Dict{Type, Type}(), type_buses=Dict{Symbol, DataType}(:x2 => String, :x3 => Float64, :x1 => Float32))
Closest candidates are:
CSV.File(::CSV.Header; finalizebuffer, startingbyteposition, endingbyteposition, limit, threaded, typemap, tasks, lines_to_check, maxwarnings, debug) at /home/nils/.julia/packages/CSV/la2cd/src/file.jl:221 got unsupported keyword argument "type_buses"
Which is the error you’re seeing.
I think what you’re seeing here is a side effect of the shorthand keyword-argument syntax introduced in Julia 1.5, where f(; x) means f(; x = x), so this might not even work on older versions.
In any case I’d say it’s always best to be explicit with kwargs and spell them out - after all, that’s what they are for.
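The difference between the explicit and the shorthand form can be tried without CSV.jl at all. Below is a minimal sketch, assuming Julia 1.5 or later; read_with is a hypothetical stand-in for any function that accepts a types keyword, not part of CSV.jl.

```julia
# Hypothetical stand-in for a function that accepts a `types` keyword,
# so the sketch runs without CSV.jl installed.
read_with(; types = nothing) = types

type_buses = Dict(:bus => Int16, :v => Float64)

# Explicit form: works regardless of what the variable is called.
explicit = read_with(; types = type_buses)

# Shorthand form: `f(; types)` means `f(; types = types)`,
# so it only works when the variable is literally named `types`.
types = type_buses
shorthand = read_with(; types)

# `read_with(; type_buses)` expands to `read_with(; type_buses = type_buses)`
# and throws a MethodError, because no keyword of that name exists.
threw = try
    read_with(; type_buses)
    false
catch err
    err isa MethodError
end
```

Both explicit and shorthand end up holding the same dictionary, while threw records that the mismatched name was rejected.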
And more to your original question: I haven’t seen any regressions in CSV.jl’s capability of detecting number types automatically in recent releases, and if there are any, that might well be a bug.
Can you share the file for which an older version of CSV.jl successfully detected the correct type, but the latest version fails to do so?
I am now more confused… excuse me, can you make it simpler?
Sorry, that wasn’t the intention of course. The above is a self-contained example, so you can execute it line by line, and if there’s something you don’t understand feel free to check back in.
Alternatively, just change the function call you posted above:
DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; type_buses))
to
DataFrame(CSV.File("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv"; types = type_buses))
and things should work.
I should also note that CSV.read(file, DataFrame) is the same as DataFrame(CSV.File(file)), just in case that added to your confusion!
Yes it worked…Thanks a ton…I have one problem remaining…Line 15 which is the last line in the csv file is not read properly. I find all values corresponding to that line as missing in the DF
This is very hard to debug without the file itself, although I would assume in this case that you are trying to force types by passing types = type_buses, and the last line can’t be read in the format you are specifying (which is probably the reason why CSV.read didn’t infer the types originally).
What happens if you just do:
CSV.read("D:/M A S T E R S S S S S S/chapter_4/imp/14bus/B14.csv", DataFrame)
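If a column unexpectedly comes back as strings, one way to track down the offending cells is to test each value with tryparse. The following is only a rough sketch using Base Julia: the sample text and its "n/a" entry are invented for illustration, and the naive comma split assumes a simple file without quoted fields.

```julia
# Hypothetical sample standing in for the real file; the "n/a" entry is invented.
raw = """
bus,v,delta
1,1.06,0.0
2,1.045,-4.98
3,n/a,-12.72
"""

lines = split(strip(raw), '\n')
header = split(lines[1], ',')

# Collect (file line, column name, text) for every cell that is not a valid Float64.
bad = Tuple{Int,String,String}[]
for (i, line) in enumerate(lines[2:end])
    for (name, cell) in zip(header, split(line, ','))
        if tryparse(Float64, cell) === nothing
            # i counts data rows, so the line number in the file is i + 1.
            push!(bad, (i + 1, String(name), String(cell)))
        end
    end
end
```

Here bad lists each problematic cell with its line number in the file, which makes it easy to look the value up in a text editor.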
When I do this, I find that all numbers are read as strings.
My file has a header and numbers, nothing else. Can I share part of it here?
From this it seems there is no line 15? You have a header, and 14 rows of data, why are you expecting 15 rows in your DataFrame?
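A quick sanity check for this kind of off-by-one is to compare the number of lines in the file with the number of data rows. A small sketch, using a made-up temporary file with one header line and three data rows:

```julia
# Write a small hypothetical file: one header line plus three data rows.
path, io = mktemp()
write(io, "bus,v\n1,1.06\n2,1.045\n3,1.01\n")
close(io)

n_lines = countlines(path)   # counts every line, including the header
n_rows = n_lines - 1         # number of data rows a DataFrame would hold
```

So the last line of a 15-line file with a header corresponds to row 14 of the DataFrame, not row 15.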
Yes, you are right… it seems I became short-sighted after spending hours in front of the computer trying to solve this issue. Now it’s solved. Thanks a million!