Space after quoted field breaks delimited file parsing


DelimitedFiles cannot parse a file that has extra whitespace after a quoted field:

using DelimitedFiles

txt = "a,\"b\"  " # note the extra spaces at the end there
file = "tmp.csv"
open(file, "w") do io
    print(io, txt)
end
x = readdlm(file, ',') # ERROR: unexpected character ' ' after quoted field at row 1 column 2

But CSV manages alright:

julia> using CSV, Tables

julia> x = CSV.File(file, header = ["h1", "h2"]) |> Tables.columntable
(h1 = Union{Missing, String}["a"], h2 = Union{Missing, String}["b"])
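
If staying with DelimitedFiles matters, one possible workaround is to strip the stray whitespace before parsing. The following is only a sketch: `strip_space_after_quotes` is a hypothetical helper, not part of DelimitedFiles, and it assumes a comma delimiter and that no quoted field itself contains a double quote followed by spaces and a comma (such a field would be altered).

```julia
using DelimitedFiles

# Hypothetical helper (not part of DelimitedFiles): remove spaces and tabs that
# sit between a double quote and the next comma, line break, or end of input.
# Assumption: the delimiter is a comma.
function strip_space_after_quotes(text::AbstractString)
    return replace(text, r"\"[ \t]+(?=,|\r?\n|$)" => "\"")
end

txt = "a,\"b\"  "                      # same input as above
cleaned = strip_space_after_quotes(txt)

# readdlm also accepts an IO source, so no temporary file is needed here.
x = readdlm(IOBuffer(cleaned), ',')
```

This trades strict validation for leniency, so it only makes sense when the trailing whitespace is known to be noise.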

Since CSV is not a standardized format (despite some efforts, such as RFC 4180), “extra” whitespace may be considered invalid by some parsers. You can also think of this as a feature (validation).

The usual adage applies: if you need quoted fields, you shouldn’t use CSV. :wink:


Yup, makes sense. I’ve noticed this before: DelimitedFiles is robust and light, CSV has more features.