That’s the Latin-1 encoding of "café", not Unicode. You need to convert this to the UTF-8 encoding of Unicode as used by Julia. I’m guessing that you are on Windows and that this is actually Windows-1252, since Latin-1 is not common anymore elsewhere.
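A quick way to see the difference from the REPL, as a minimal sketch using only Base functions (the variable names are just for illustration):

```julia
legacy = "caf\xe9"   # the single byte 0xe9 is 'é' in Latin-1 / Windows-1252
utf8   = "café"      # the same text as a normal (UTF-8) Julia string literal

@show isvalid(legacy)    # false: a lone 0xe9 byte is not valid UTF-8
@show codeunits(legacy)  # 4 bytes, ending in 0xe9
@show codeunits(utf8)    # 5 bytes, 'é' is stored as 0xc3 0xa9
```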
Two options:
- Read it into Julia as bytes (`Vector{UInt8}` via `read(io)`) and convert the encoding with StringEncodings.jl or some similar package, e.g. `decode(Vector{UInt8}("caf\xe9"), "Windows-1252")` gives `"café"`.
- Change your files to use UTF-8. Windows-1252 is an archaic encoding that can only encode 256 characters, nowhere near all of Unicode. People should really stop using it. See here for various tools. (e.g. For a single file, you can just open it in Notepad or some other editor and choose “UTF-8” when you save, but there are also batch tools to re-encode many files at once.)
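To make both options concrete, here is a rough sketch that reads a file as raw bytes, decodes it with StringEncodings.jl, and writes the result back out as UTF-8. The helper name `reencode_to_utf8`, its `from` keyword, and the file names are made up for this example, and it assumes your files really are Windows-1252; only `decode` comes from the package.

```julia
using StringEncodings  # third-party package: import Pkg; Pkg.add("StringEncodings")

# Hypothetical helper: re-encode one file from a legacy encoding to UTF-8.
function reencode_to_utf8(src::AbstractString, dest::AbstractString;
                          from::AbstractString = "Windows-1252")
    bytes = read(src)            # Vector{UInt8}: the raw bytes, not interpreted
    text = decode(bytes, from)   # convert to a valid UTF-8 Julia String
    write(dest, text)            # Julia writes strings out as their UTF-8 bytes
    return text
end

# Demonstration on a temporary file holding the bytes from the question.
mktempdir() do dir
    src = joinpath(dir, "legacy.txt")
    dest = joinpath(dir, "utf8.txt")
    write(src, UInt8[0x63, 0x61, 0x66, 0xe9])   # "café" in Windows-1252
    text = reencode_to_utf8(src, dest)
    @show text isvalid(text) read(dest)
end
```

Writing to a separate `dest` rather than overwriting `src` keeps the original around in case the guessed source encoding turns out to be wrong.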
(In any case, this is not technically about “parsing”, which is a distinct concept from “encoding”.)