And that is very complex, compared to simply having the code point stored as its numerical value.
As I showed elsewhere, the difference in the code generated to pack and unpack that UTF-8-based format is very large (in v0.6 and earlier versions of Julia, going between `Char` and `UInt32` is a no-op).
It also means that you can no longer share a `Vector{Char}` with a C/C++ array of `wchar_t` (when it is 32-bit) or `char32_t`.
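To make the pack/unpack cost concrete, here is a rough C sketch of what converting between a plain code point and a UTF-8-based 32-bit value involves. It assumes the layout is the character's UTF-8 bytes left-aligned in a 32-bit word and zero-padded on the right, which is my understanding of the newer representation. The names `pack_utf8` and `unpack_utf8` are made up for illustration, and the sketch only handles valid, well-formed input; it is not Julia's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a code point as UTF-8 bytes, left-aligned
   in a 32-bit word (assumed layout). Valid code points only. */
uint32_t pack_utf8(uint32_t cp) {
    assert(cp <= 0x10FFFFu);
    if (cp < 0x80u)                       /* 1 byte: plain ASCII */
        return cp << 24;
    if (cp < 0x800u)                      /* 2 bytes */
        return ((0xC0u | (cp >> 6)) << 24)
             | ((0x80u | (cp & 0x3Fu)) << 16);
    if (cp < 0x10000u)                    /* 3 bytes */
        return ((0xE0u | (cp >> 12)) << 24)
             | ((0x80u | ((cp >> 6) & 0x3Fu)) << 16)
             | ((0x80u | (cp & 0x3Fu)) << 8);
    return ((0xF0u | (cp >> 18)) << 24)   /* 4 bytes */
         | ((0x80u | ((cp >> 12) & 0x3Fu)) << 16)
         | ((0x80u | ((cp >> 6) & 0x3Fu)) << 8)
         |  (0x80u | (cp & 0x3Fu));
}

/* Hypothetical helper: recover the code point from the packed form.
   Assumes the word holds a well-formed UTF-8 sequence. */
uint32_t unpack_utf8(uint32_t u) {
    if (u < 0x80000000u)                          /* 1 byte */
        return u >> 24;
    if ((u & 0xE0000000u) == 0xC0000000u)         /* 2 bytes */
        return (((u >> 24) & 0x1Fu) << 6)
             |  ((u >> 16) & 0x3Fu);
    if ((u & 0xF0000000u) == 0xE0000000u)         /* 3 bytes */
        return (((u >> 24) & 0x0Fu) << 12)
             | (((u >> 16) & 0x3Fu) << 6)
             |  ((u >> 8) & 0x3Fu);
    return (((u >> 24) & 0x07u) << 18)            /* 4 bytes */
         | (((u >> 16) & 0x3Fu) << 12)
         | (((u >> 8) & 0x3Fu) << 6)
         |  (u & 0x3Fu);
}
```

With the code point stored directly, both functions would simply return their argument, and a buffer of such values would be bit-for-bit the same as a `char32_t` array. With the packed layout, every element crossing the C boundary needs a conversion like the one above.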