String7 type with read CSV?

Hello,

I am trying to import a single-column file (very long, more than 1 million lines). The values are real numbers, so I think they should be Float64 (numbers like 17, 139.4, 2345.87).

Here is the code

using CSV, DataFrames
df = CSV.read("QD.csv", DataFrame; header = false)
rename!(df, [:Column1] .=> [:QUAL])

But now I just tried

first(df, 10)

and I get something strange

10×1 DataFrame
 Row │ Column1
     │ String7
─────┼─────────
   1 │ 21.54
   2 │ 33.09
   3 │ 33.09
   4 │ 2.66
   5 │ 1.43
   6 │ 0.77
   7 │ 2.93
   8 │ 1.59
   9 │ 1.34
  10 │ 1.3

What is that “String7”? Why didn’t it recognise the numbers as floats?

And how would I change this to Float64?
Thanks a lot

I am a newbie here, sorry if this is an obvious question

EDIT: I tried converting the column to Float64, but got

ArgumentError: cannot parse String7(".") as Float64

So the problem is that (for some reason) one of your rows contains a lone . with no digits instead of a number. That cannot be converted to a Float64, and a single non-numeric value is enough for the whole column to be read as strings.
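One way to locate such entries, as a rough sketch: tryparse returns nothing instead of throwing, so it can be used to list the rows that are not valid numbers. The vector raw below is made-up sample data standing in for the real column (for example df.QUAL), and bad_rows is just an illustrative name.

```julia
# Made-up sample values standing in for the column read from the file.
raw = ["21.54", "33.09", " . ", "2.66", "."]

# tryparse returns `nothing` instead of throwing when a value is not a number.
bad_rows = findall(s -> tryparse(Float64, s) === nothing, raw)

# repr prints the surrounding quotes, which makes stray whitespace visible.
for i in bad_rows
    println("row ", i, ": ", repr(raw[i]))
end
```

Printing with repr rather than plain println is deliberate: a value such as " . " would otherwise look the same as ".".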

OK, so the problem is the input file. Thanks a lot, I am going to investigate this.

If the "." is meant to represent missing, you can tell that to CSV.read. I think the keyword argument is missingstring, which takes a single string or a vector of strings.
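A minimal sketch of that idea, assuming a lone "." really does mean missing. The three-line input is made up and is read from an IOBuffer instead of the real file, so the example is self-contained.

```julia
using CSV, DataFrames

# Made-up three-line input standing in for the real file.
data = IOBuffer("21.54\n.\n2.66\n")

# Treat a lone "." as missing; the remaining values can then parse as numbers.
df = CSV.read(data, DataFrame; header = false, missingstring = ".")

# The column should now allow missing values alongside Float64.
println(eltype(df.Column1))
```

With the real file, the IOBuffer would be replaced by the file name. If the column has no legitimate missing values, it is probably better to fix the file than to hide the bad rows this way.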

Actually, there is no missing data, and I also checked for a line containing only a single .

but

awk '/^.$/' QUAL.csv

returns nothing. Is it possible this is introduced when reading the file? I am really confused

Could it be that the dot is not completely alone but has whitespace around it?

This is a possibility. The file is an extract from a very complicated bioinformatics file format, and it’s very possible something unexpected happened. But what matters here is that I now know the problem is apparently not in my Julia code but somewhere upstream. Thank you, I will investigate.

EDIT: indeed, there is whitespace somewhere that shouldn’t be there. Thanks a lot!

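For more information on String7: it is a type from the InlineStrings.jl package. It stores a short string (up to 7 bytes for String7) inline at a fixed size, which reduces allocations and helps performance. CSV.jl uses these types for columns it detects as short strings.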
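As a small illustration, assuming the InlineStrings.jl package is installed: a String7 behaves like any other string, so once the data is clean it can be parsed to Float64 in the usual way. The value below is made up.

```julia
using InlineStrings

# A fixed-size string that holds up to 7 bytes.
s = String7("21.54")

# String7 is an AbstractString, so the usual string functions apply.
println(s isa AbstractString)

# Parsing works the same way as for an ordinary String.
x = parse(Float64, s)
println(x)

# For a whole DataFrame column the same idea would be broadcast, e.g.
# df.QUAL = parse.(Float64, df.QUAL)
```

The broadcast line is left as a comment because it needs a DataFrame whose column contains only valid numbers.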

Thanks, bookmarked!

I would like to thank you all for your friendly welcome.
