New behaviour due to an update of the package CSV when using CSV.read

Hello,

When the file tmp.csv just contains the string “INFI_”, the code

using CSV
data = CSV.read("tmp.csv",header=false,delim=';');
data

gives the result

1×1 DataFrames.DataFrame
│ Row │ Column1 │
│ │ String │
├─────┼─────────┤
│ 1 │ INFI_ │

which is what I want.

However, when the file tmp.csv contains only the string “INFI”, the same code gives the result

1×1 DataFrames.DataFrame
│ Row │ Column1 │
│ │ Float64 │
├─────┼─────────┤
│ 1 │ Inf │

In other words, the string is converted to a float, so INFI seems to be treated as a reserved word. I get the same behaviour when I replace “INFI” with “INF”.

Is there a simple way to avoid this behaviour, i.e. to keep “INFI” as a string? Thanks very much.

NB: This new behaviour occurs with CSV v0.5.9.
With CSV v0.4.3, I do not have this problem.

I would check if there is an existing issue for CSV.jl, and if not, open one. This looks like a bug.

Thanks very much, Tamas! This issue seems similar to the one described in “New behaviour due to an update of the package CSV when using CSV.write”.

If there is no solution, is there another package for reading CSV files in Julia (other than going back to CSV v0.4.3)?

There is

OK, thanks again! By the way, perhaps a solution for my issue could be

CSV.read(file; types=[String])

PS: I cannot test this on my Linux machine because access is restricted from home (it is the weekend …).

Yes, you can always manually specify what the type of a column should be, for example with types=[String], or with a Dict keyed by column name or number: types=Dict(1=>String).
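
For reference, here is a minimal sketch of both forms applied to the file from the question. It uses CSV.File rather than CSV.read, on the assumption that this works the same way across CSV.jl versions (newer releases expect an extra sink argument for CSV.read, e.g. a DataFrame). The file contents are written by the script itself so it can be run as is.

```julia
using CSV

# Recreate the one-field file from the question.
write("tmp.csv", "INFI\n")

# Force the only column to String so "INFI" is not detected as a Float64.
f = CSV.File("tmp.csv"; header=false, delim=';', types=[String])

# Same idea, with the type given per column number.
g = CSV.File("tmp.csv"; header=false, delim=';', types=Dict(1 => String))

# Without a header row the column gets the automatic name Column1.
value = first(f).Column1
println(value, " :: ", typeof(value))
```

With an explicit type the detection step is skipped for that column, so the value should come back as the text that is in the file.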

What’s going on here is that the Parsers.jl package parses both INF and INFINITY as valid Float64 values, and it looks like it considers any prefix in-between as valid as well, so this can ultimately be fixed in Parsers.jl itself.
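
To make the rule described above concrete, here is a small illustration in plain Julia. The function reads_as_infinity is hypothetical and is not the Parsers.jl implementation; it only mimics the described behaviour, assuming that any case-insensitive prefix of INFINITY that is at least as long as INF is accepted.

```julia
# Hypothetical helper, NOT part of Parsers.jl: returns true when the whole
# field is a prefix of "INFINITY" (ignoring case) of length 3 or more.
function reads_as_infinity(field::AbstractString)
    s = uppercase(field)
    return length(s) >= 3 && startswith("INFINITY", s)
end

reads_as_infinity("INF")       # true
reads_as_infinity("INFI")      # true, a prefix between INF and INFINITY
reads_as_infinity("INFINITY")  # true
reads_as_infinity("INFI_")     # false, so the field stays a String
```

Under this assumed rule, “INFI_” is left alone because the trailing underscore means the field is no longer a prefix of INFINITY, which matches the two results shown in the question.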