You can use Unicode.graphemes to iterate over graphemes (“user-perceived characters” in Unicode terminology), regardless of how many code points each one is encoded with:
julia> using Unicode
julia> graphemes("Héllo World")
length-11 GraphemeIterator{String} for "Héllo World"
julia> graphemes("Héllo World") |> collect
11-element Array{SubString{String},1}:
"H"
"é"
"l"
"l"
"o"
" "
"W"
"o"
"r"
"l"
"d"
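Note that the second element of this array is a string of 2 code points (“2 characters” in the terminology of the Julia docs), as happens when é is entered as e followed by a combining acute accent (U+0301) rather than as the single precomposed code point U+00E9.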
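To make the difference between code points and graphemes concrete, here is a small sketch. It assumes the é is spelled in decomposed form (the string literal below is my own construction using the escape \u0301, not something copied from the session above), and it uses Unicode.normalize only to show the precomposed alternative:

```julia
using Unicode

# Assumption: "e" followed by U+0301 (combining acute accent),
# i.e. two code points that render as a single "é".
s = "He\u0301llo World"

println(length(s))                # 12: length counts code points
println(length(graphemes(s)))     # 11: graphemes counts user-perceived characters

g = collect(graphemes(s))[2]      # the "é" grapheme as a SubString
println(length(g))                # 2: it is made of two code points

# NFC normalization composes e + U+0301 into the single code point U+00E9
println(length(Unicode.normalize(s, :NFC)))   # 11
```

So if you need a count or an index that matches what a reader sees, iterate over graphemes(s); if you only need a stable code point count, normalizing to NFC first may be enough for text like this, though not every grapheme has a precomposed form.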