I would like to announce the new package UnicodeREPL.jl, a REPL enhancement for tab-completing any Unicode code point to its Unicode symbol. This is useful if you want Unicode symbols that are neither in the standard Julia tab completion nor on your keyboard.
Writing Unicode characters in this way inside a string is actually standard Julia syntax, so it works with or without UnicodeREPL.jl.
The advantage of UnicodeREPL.jl is that you can tab-complete, so you can see exactly what you are writing before pressing return. UnicodeREPL.jl also works for Unicode characters that are not inside strings. I did write that one can use code points of any length, but I will consider including this example in future documentation to make this clearer.
Wow, Glyphy.jl looks like a very useful companion to UnicodeREPL.jl. It has actually gotten me thinking about some kind of synergy between the two packages. It is certainly worth considering installing Glyphy if one uses UnicodeREPL.
On a related note, I have found some issues with how Julia deals with Unicode characters with code points above FFFF.
For example,
'\U1D6C1' returns '𝛁': Unicode U+1D6C1 (category Sm: Symbol, math)
"\U1D6C1" returns "𝛁"
'\u1D6C1' returns ParseError: character literal contains multiple characters
"\u1D6C1" returns "ᵬ1" (note that the code point for ᵬ is 1D6C)
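The four cases above can be checked in a short script. This is only a sketch: the variable names are made up for illustration, and the failing Char literal is passed through `Meta.parse` with `raise = false` because writing it directly would stop the whole script from parsing.

```julia
# Char literal with \U: up to 8 hex digits are consumed, so this is one code point.
c = '\U1D6C1'
println(codepoint(c) == 0x1D6C1)             # true

# String with \U: a single-character string.
s_big = "\U1D6C1"
println(length(s_big))                       # 1

# String with \u: only 4 hex digits are consumed; the trailing "1" is a literal character.
s_bmp = "\u1D6C1"
println(collect(s_bmp) == ['\u1D6C', '1'])   # true

# Char literal with \u and 5 digits: parse it from a string so the error
# is returned as an expression instead of being thrown.
ex = Meta.parse("'\\u1D6C1'"; raise = false)
println(ex isa Expr && ex.head === :error)
```

The third case is the one that is easy to miss: no error is raised, you simply get a different string than intended.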
Unicode BMP code points (\u with 1-4 trailing hex digits)
All Unicode code points (\U with 1-8 trailing hex digits; max value = 0010ffff)
Hex bytes (\x with 1-2 trailing hex digits)
Octal bytes (\ with 1-3 trailing octal digits)
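As a quick illustration of the four escape forms listed above, here is a small sketch that writes the same letter, A (U+0041, decimal 65), in each of them. The variable names are only for illustration.

```julia
# Four ways to write the letter 'A' (code point U+0041, decimal 65).
a_u   = "\u41"     # \u with 1-4 hex digits
a_U   = "\U41"     # \U with 1-8 hex digits
a_hex = "\x41"     # \x with 1-2 hex digits
a_oct = "\101"     # octal: 1*64 + 0*8 + 1 = 65
println(a_u == a_U == a_hex == a_oct == "A")   # true

# The largest code point that \U accepts is U+10FFFF.
println(isvalid('\U10ffff'))                   # true
```

Note that \x and the octal form write single bytes, so they only coincide with \u and \U for ASCII characters like this one.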
\u covers only the BMP, likely because at one time that was all there was, so \U is needed now to avoid ambiguity in some cases.
You actually hit such a case, which is why you got an error for '\u1D6C1' but not for "\u1D6C1": the latter is a valid two-character string, like "\u1D6C" followed by "1". That is one reason to use the character-literal syntax if you are really after a Char, not a String (it's also a tiny bit faster).
It was a bit hard to look this up in the docs, and where should it be documented? I ended up looking up "escape" and then found String in a list. Note that in Strings · The Julia Language only \u is mentioned, not \U, nor what they mean.
I actually misunderstood the C spec when I implemented this. In C, \u requires exactly four hex digits after it and \U requires exactly eight. Ours allow up to four and up to eight, which is kind of redundant. But yes, this is like C but a little more permissive.
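To make the "more permissive" part concrete, here is a small sketch of what the variable-length escapes imply in Julia. The examples use é (U+00E9) purely for illustration.

```julia
# Zero-padded and unpadded forms denote the same character.
println('\ue9' == '\u00e9' == '\U000000e9')   # true

# Any \u escape can also be written with \U, which is the redundancy noted above.
println('\u1D6C' == '\U1D6C')                 # true

# The escape stops at the first non-hex character, so short forms need care.
println("\ue9x" == "éx")     # true: 'x' is not a hex digit
println("\ue9a" == "éa")     # false: "e9a" is read as the single code point U+0E9A
```

When the next character of the string happens to be a hex digit, padding the escape to its full width (four digits for \u, eight for \U) avoids the last surprise.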