Most modern languages support Unicode identifiers these days:
- Python (2. Lexical analysis — Python 3.3.7 documentation)
- Rust (Identifiers - The Rust Reference)
- Swift (Documentation)
- Go (The Go Programming Language Specification - The Go Programming Language)
- C# (C# identifier names - rules and conventions - C# | Microsoft Learn)
- Raku (identifiers | Raku Documentation)
- Java (Charsets and Unicode Identifiers in Java - DZone)
- Ruby (Coding Ninjas Studio)
- C++ (Identifiers - cppreference.com)
- Heck, even C99 has rudimentary support… but as the oldest one here, it ironically allows `int \u03B1 = 2;` while leaving the literal `α = 2;` implementation-defined. (Identifier - cppreference.com)
- JavaScript also allows Unicode in identifiers, as well as Unicode escapes in identifiers, somewhat similarly to C (Valid JavaScript variable names in ES5 · Mathias Bynens)
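Of the languages I thought of here, only Perl, R, and Fortran don’t seem to support Unicode identifiers. C and JavaScript support `\u` escapes in identifiers (C also accepts the eight-digit `\U` form); as far as I know, C++, Java, and C# accept similar escapes too. None support LaTeX- or HTML-like entity names.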
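To make the comparison concrete for one language from the list, here is a minimal sketch that asks Python itself which spellings of "alpha" it accepts as an identifier. It uses only the built-in `str.isidentifier()` and `compile()`. The `candidates` dictionary and the `compiles` helper are names made up for this illustration.

```python
# Candidate spellings of "alpha" as an identifier (illustrative only).
candidates = {
    "literal": "α",            # the Unicode character itself
    "latex-like": "\\alpha",   # backslash followed by a name
    "html-like": "&alpha;",    # HTML entity syntax
}

# str.isidentifier() applies Python's identifier rules to a string.
results = {name: text.isidentifier() for name, text in candidates.items()}


def compiles(source: str) -> bool:
    """Return True if the Python compiler accepts `source`."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# A literal Unicode identifier versus a C-style escape in the same position.
literal_ok = compiles("α = 2")
escape_ok = compiles("\\u03b1 = 2")  # the source text is: \u03b1 = 2

print(results)
print(literal_ok, escape_ok)
```

On CPython 3, I would expect only the literal form to be accepted: the escape and both entity-style spellings are rejected. That matches the picture above for Python, but says nothing about the other languages.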