My workaround is to avoid using the header option, as follows:
# Read data
import DelimitedFiles
Data = DelimitedFiles.readdlm("Smap_Id_Select.csv", ',')
# Read header
Header = Data[1,1:end]
# Remove first row
Data = Data[1:end.≠1,1:end]
Such rounding might be happening in more places than the first row, so removing the first row is not a general solution. If you have data of mixed types, it's best to read it in by specifying the element type as Any.
The issue with specifying header=true but no type is that, by default, the element type is assumed to be Float64, so you get rounding errors on the column 1 (Id) data.
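A minimal sketch of that failure mode (the file name and Id value here are made up; the point is that Float64 can only represent every integer exactly up to 2^53):

```julia
using DelimitedFiles

# Hypothetical sample whose Id exceeds 2^53, beyond which
# Float64 can no longer represent every integer exactly.
write("sample.csv", "Id,Val\n1234567890123456789,1.5\n")

# Default element type: numeric cells are parsed as Float64.
data, header = readdlm("sample.csv", ','; header=true)

# The Id came back as a rounded Float64, so converting it
# back does not recover the original value.
Int(data[1, 1]) == 1234567890123456789  # false
```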
In your example this can be fixed by specifying header=true and the element type Int for the data.
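As a sketch, with a hypothetical all-integer file:

```julia
using DelimitedFiles

# Hypothetical all-integer data, including an Id too large
# to round-trip through Float64.
write("ids.csv", "Id,Count\n1234567890123456789,7\n")

# Passing Int as the element type parses every cell as Int64,
# so the large Id survives exactly.
data, header = readdlm("ids.csv", ',', Int; header=true)
data[1, 1] == 1234567890123456789  # true
```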
But …
In this case I have only Int64, but what happens if the Id is an integer and the data is Float64?
Well, here you can specify header=true with type Any.
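A sketch of that mixed-type case (hypothetical file; with element type Any, each cell keeps a parsed value of its own type):

```julia
using DelimitedFiles

# Hypothetical mixed data: integer Id column, Float64 measurements.
write("mixed.csv", "Id,Temp\n36911,21.5\n42013,19.0\n")

# With Any, the Ids parse as integers and the measurements as
# floats, so neither column is mangled into the other's type.
data, header = readdlm("mixed.csv", ',', Any; header=true)
```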
Or use CSV.jl, which can do a better job of inferring column types, or lets you specify them explicitly.
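For example (assuming CSV.jl is installed; the file and its contents are made up):

```julia
using CSV

# Hypothetical mixed file.
write("mixed.csv", "Id,Temp\n36911,21.5\n")

# CSV.File infers a type per column (Int64 for Id, Float64
# for Temp), or you can pin a column's type explicitly:
f = CSV.File("mixed.csv"; types = Dict(:Id => Int64))
f[1].Id    # an Int64, not a rounded Float64
f[1].Temp  # a Float64
```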
Your solution of reading without header=true and manually slicing out the first row works because the element type is then inferred as Any: the first row (the header) contains strings while the data rows are numeric.
BTW, I think you can simplify your slicing:

# Read header
Header = Data[1,1:end]
# Remove first row
Data = Data[1:end.≠1,1:end]

to just:

Header = Data[1, :]
Data = Data[2:end, :]
What is the fastest way to read a .csv into an Array, and not a DataFrame?
I would like to thank you for providing DelimitedFiles; since it ships with core Julia, I expect it to work as expected, with no surprises.
My comment is that if DelimitedFiles is outdated, it may be a good idea to replace it with CSV.jl. I understand that there must then be some tools to easily convert a DataFrame into an Array.
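For what it's worth, a sketch of both routes (assuming CSV.jl, Tables.jl and DataFrames.jl are installed; note that a Matrix has a single element type, so mixed columns get promoted to a common type):

```julia
using CSV, DataFrames, Tables

# Read a CSV file straight into a Matrix, skipping DataFrame entirely:
m = Tables.matrix(CSV.File("Smap_Id_Select.csv"))

# Or, if you already have a DataFrame, the conversion is one call:
df = CSV.read("Smap_Id_Select.csv", DataFrame)
m2 = Matrix(df)
```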