The first component of the names matrix should be "t", rather than "\ufefft". I understand \ufeff is an indicator of U+FEFF encoding. How can I pass that to readdlm or parse it to eliminate that indicator in the resulting names matrix?
0xFEFF at the start of a file is the byte order mark (BOM) for UTF-16 big-endian data.
I wonder why you get the strings the way you do. My guess is that your input file is malformed: it starts with the UTF-16 BOM, but the actual data isn’t UTF-16.
But perhaps I am wrong and the following solution works for you:
Well, I don’t know what your data1.csv is, but 0xEFBB is the start of the UTF-8 BOM (EF BB BF). It seems your data1.csv has somehow changed in the meantime.
Can you open your original data1.csv with a hex editor and report the first few bytes?
I am expecting:
FE FF 74 2C 63 2C 68 68 6F …
based on your original post, which would be an incorrectly encoded file. A correct UTF-16 big-endian file starting with FE FF would be:
FE FF 00 74 00 2C 00 63 00 2C 00 68 00 68 00 6F
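If no hex editor is at hand, the first bytes can also be checked from Julia itself. The following is only a minimal sketch: `first_bytes_hex` is a hypothetical helper name, and the temporary file stands in for data1.csv, whose real contents I don't know.

```julia
# Hypothetical helper: return the first n bytes of a file as uppercase hex pairs.
function first_bytes_hex(path::AbstractString, n::Integer=16)
    bytes = open(io -> read(io, n), path)   # reads at most n bytes
    return join(uppercase.(string.(bytes, base=16, pad=2)), " ")
end

# Stand-in for data1.csv: a small file that starts with a UTF-8 BOM.
sample = tempname()
write(sample, "\ufefft,c")
hexdump = first_bytes_hex(sample)   # "EF BB BF 74 2C 63"
```

A result starting with EF BB BF points to a UTF-8 BOM, while FE FF would point to UTF-16 big-endian.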
I see, it’s a UTF-8 file with a BOM.
With StringEncodings I get an error:
julia> readdlm(open(read, "new.txt", enc"UTF-8 BOM"),',')
ERROR: Conversion from UTF-8 BOM to UTF-8 not supported by iconv implementation, check that specified encodings are correct
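Since a UTF-8 BOM decodes to the single character U+FEFF, one possible workaround without StringEncodings is to drop that character before handing the text to readdlm. This is a sketch under assumptions: `read_without_bom` is a hypothetical helper, and the temporary file only imitates a BOM-prefixed CSV like the one discussed here.

```julia
using DelimitedFiles

# Hypothetical helper: return the file contents as a String without a leading BOM.
# A UTF-8 BOM (bytes EF BB BF) decodes to the single character U+FEFF.
function read_without_bom(path::AbstractString)
    text = read(path, String)
    return startswith(text, "\ufeff") ? text[nextind(text, 1):end] : text
end

# Stand-in for the real file: a tiny CSV that starts with a UTF-8 BOM.
path = tempname()
write(path, "\ufefft,c\n1,2\n")

# `header` plays the role of the names matrix from the question.
data, header = readdlm(IOBuffer(read_without_bom(path)), ',', header=true)
```

With the BOM removed, the first header entry should compare equal to "t" rather than "\ufefft".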
Would CSV.jl and DataFrames be a viable option for you?
using DataFrames, CSV
data = DataFrame(CSV.File("data1.csv"))
That could work. Do you know if there’s a way to extract the data from the DataFrame into an Array without building it manually? I guess in the worst case I could do: