I get this when trying to run
CSV.read("chuckle/random.csv", DataFrame)
But when running the same thing on another CSV in the same directory, it works fine.
Can you provide the full stack trace? Also could you report the output of
using Pkg
Pkg.status()
Can you try increasing the value of rows_to_check?
So something like this?
CSV.read("chuckle/random.csv", DataFrame; rows_to_check = 100)
Try increasing it to 100 or 500 or 1000?
Stacktrace:
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type String
Closest candidates are:
convert(::Type{S}, ::CategoricalArrays.CategoricalValue) where S<:Union{AbstractChar, AbstractString, Number}
@ CategoricalArrays C:\Users\steve\.julia\packages\CategoricalArrays\0yLZN\src\value.jl:92
convert(::Type{String}, ::WeakRefStrings.WeakRefString)
@ WeakRefStrings C:\Users\steve\.julia\packages\WeakRefStrings\31nkb\src\WeakRefStrings.jl:81
convert(::Type{String}, ::FilePathsBase.AbstractPath)
@ FilePathsBase C:\Users\steve\.julia\packages\FilePathsBase\4RrDh\src\path.jl:117
...
Stacktrace:
[1] get!(default::Function, h::Dict{String, UInt32}, key0::Missing)
@ Base .\dict.jl:455
[2] checkpooled!(#unused#::Type{String}, pertaskcolumns::Nothing, col::CSV.Column, j::Int64, ntasks::Int64, nrows::Int64, ctx::CSV.Context)
@ CSV C:\Users\steve\.julia\packages\CSV\OnldF\src\file.jl:514
[3] CSV.File(ctx::CSV.Context, chunking::Bool)
@ CSV C:\Users\steve\.julia\packages\CSV\OnldF\src\file.jl:302
[4] File
@ C:\Users\steve\.julia\packages\CSV\OnldF\src\file.jl:227 [inlined]
[5] #File#32
@ C:\Users\steve\.julia\packages\CSV\OnldF\src\file.jl:223 [inlined]
[6] CSV.File(source::String)
@ CSV C:\Users\steve\.julia\packages\CSV\OnldF\src\file.jl:162
[7] read(source::String, sink::Type; copycols::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ CSV C:\Users\steve\.julia\packages\CSV\OnldF\src\CSV.jl:117
[8] read(source::String, sink::Type)
@ CSV C:\Users\steve\.julia\packages\CSV\OnldF\src\CSV.jl:113
[9] top-level scope
@ c:\julia\process.jl:61
Output of Pkg.status():
[336ed68f] CSV v0.10.11
[49dc2e85] Calculus v0.5.1
[a93c6f00] DataFrames v1.6.1
[1313f7d8] DataFramesMeta v0.14.1
[89b67f3b] ExcelFiles v1.0.0
⌅ [c04bee98] ExcelReaders v0.11.0
[38e38edf] GLM v1.9.0
[4076af6c] JuMP v1.16.0
[626c502c] Parquet v0.8.4
[91a5bcdd] Plots v1.39.0
[438e738f] PyCall v1.96.2
⌅ [df47a6cb] RData v0.8.3
[ce6b1742] RDatasets v0.7.7
[9d95f2ec] TypedTables v1.4.3
⌅ [fdbf4ff8] XLSX v0.7.10
Same Error and Stacktrace as in my reply to @mkitti
How many lines are in the chuckle/random.csv file?
That is, what is the output of wc -l chuckle/random.csv?
The stack trace points to checkpooled! at src/file.jl:514 in CSV.jl.
I just realized that OP is on Windows.
For Windows, this should give you the number of lines in the file. Run the following in PowerShell:
(Get-Content random.csv).Length
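If you would rather stay inside Julia, countlines from Base gives the same kind of count on any platform. A minimal sketch, using a throwaway temp file as a stand-in for the real CSV (the file contents here are made up for illustration):

```julia
# Stand-in file so the sketch runs anywhere; replace `path` with
# "chuckle/random.csv" to count the lines of the real file.
path = tempname()
write(path, "a,b\n1,2\n3,4\n")

# `countlines` is part of Julia Base and counts end-of-line characters,
# much like `wc -l` does.
nlines = countlines(path)
println(nlines)
```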
└─$ wc -l c:/data/chuckle/random.csv
2326 c:/data/chuckle/random.csv
Can you try this?
CSV.read("chuckle/random.csv", DataFrame; rows_to_check = 2326)
Same thing
How about this?
CSV.File("chuckle/random.csv"; rows_to_check = 2326)
Also, is it possible for you to share the CSV file?
I can’t share it due to its contents.
Can you try doing the following?
for i in 1:2326
@info "Attempting" i
CSV.File("chuckle/random.csv"; limit = i, ntasks = 1)
@info "Finished" i
end
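A variation on the loop above, in case it helps: instead of reading the log to see where it stopped, catch the exception and return the first limit that fails. This is only a sketch. first_failing_limit is a hypothetical helper, not something CSV.jl provides, and it takes the parsing call as a function so it can be tried without the package installed.

```julia
# Hypothetical helper (not part of CSV.jl): call `try_parse(i)` for i = 1, 2, ...
# and return the first i that throws, together with the exception.
# Returns `nothing` if every call succeeds.
function first_failing_limit(try_parse, nrows::Integer)
    for i in 1:nrows
        try
            try_parse(i)
        catch err
            return (limit = i, error = err)
        end
    end
    return nothing
end

# Intended use with CSV.jl (commented out so the sketch runs without the package):
# using CSV
# result = first_failing_limit(2326) do i
#     CSV.File("chuckle/random.csv"; limit = i, ntasks = 1)
# end
# result === nothing || @info "First failing limit" result.limit result.error
```

Note that limit counts data rows, so the reported number may be offset from the line number in the file by the header row. Re-parsing from the top each time is slow on large files, but it mirrors the loop above and is fine for a couple of thousand rows.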
For clarity, the referenced error indicates that something tried to convert a Missing value into a String. It seems like the CSV.jl parser is having trouble with something in this .csv file that led to a ::Missing value somewhere unexpected.
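To illustrate, the top frame of the stack trace can be reproduced in isolation, with no CSV.jl involved. This is a sketch of the mechanism only; the name pool is made up here, and the Dict type is copied from frame [1] of the trace.

```julia
# A Dict with String keys, like the one in frame [1] of the stack trace.
pool = Dict{String, UInt32}()

# Looking up `missing` makes `get!` convert the key to String first,
# and there is no method to convert Missing to String.
err = try
    get!(() -> UInt32(1), pool, missing)
    nothing
catch e
    e
end

println(typeof(err))
```

So the message describes the internal step that failed, not what was wrong with the input that produced the missing value.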
I’ll have to keep that code handy. It helped find what the error was. It stopped on a line where one of the fields was like this:
...","text "text" text","....
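For anyone hitting the same thing: a field like that has double quotes inside a quoted field that are not doubled (""), which looks like the likely trigger here. Below is a rough sketch of a scan for such lines in plain Julia. It is a heuristic under several assumptions: comma delimiter, " as the quote character, "" as the escape, and no fields spanning multiple lines. Both function names are hypothetical, and it can miss or misreport unusual cases.

```julia
# Heuristic: a double quote is "stray" if it is neither next to a delimiter
# or line boundary, nor part of an escaped pair ("").
function has_stray_quote(line::AbstractString; delim::Char = ',')
    chars = collect(line)
    n = length(chars)
    i = 1
    while i <= n
        if chars[i] == '"'
            at_field_start = i == 1 || chars[i - 1] == delim
            at_field_end = i == n || chars[i + 1] == delim
            if at_field_start || at_field_end
                i += 1            # opening or closing quote of a field
            elseif chars[i + 1] == '"'
                i += 2            # escaped quote, skip the pair
            else
                return true       # quote in the middle of a field
            end
        else
            i += 1
        end
    end
    return false
end

# Line numbers (1-based, header included) of lines that look suspicious.
function stray_quote_lines(path::AbstractString; delim::Char = ',')
    return [i for (i, line) in enumerate(eachline(path)) if has_stray_quote(line; delim = delim)]
end
```

Calling stray_quote_lines("chuckle/random.csv") would then list candidate lines to inspect by hand before handing the file to CSV.read.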
Good to hear you found the problem.
I wonder if this is worth filing an Issue on the CSV.jl repo? This seems like a relatively unhelpful error message, indicating the direct internal cause of the error (can’t convert missing to string) rather than the more pertinent error: malformed input at line n of the input file.
I’ll post there also. And yes, there were no Missing values in the file at all.