I’m thinking of writing a package called “CàiSemiprimes” or “蔡Semiprimes” (Cài is a mathematician). Is it legal for a package name to have a non-ASCII character? To start with a character that does not have an upper/lowercase distinction?
à is in several keyboard layouts, and we program with characters such as ℯ, ∈, and ⊻, which aren’t on any keyboard layout that I know of, so we have to type \euler, \in, or \xor in the REPL, hit tab, and paste the result in the editor.
Is this documented? I don’t see it in the “Creating Packages” chapter of the Pkg.jl documentation. What happens if I try to register a package with a non-ASCII character in the name?
Automatic merging is one thing but then there is also:
Are there any requirements for package names in the General registry?
There are no hard requirements, but it is highly recommended to follow the package naming guidelines.
Nevertheless, ASCII characters are definitely more user-friendly.
You are right. However, you can generally use the ASCII counterparts (in instead of ∈, xor(a, b) instead of a ⊻ b, …), so if you use the Unicode characters in your code instead, it’s by choice. If you put a non-ASCII letter in your package’s name, then users have to type that letter if they’d like to use the package, and there is no ASCII counterpart.
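To make the point about ASCII counterparts concrete, here is a small sketch comparing the Unicode and ASCII spellings mentioned in this thread. The variable names are only for illustration.

```julia
# Each Unicode operator or constant below has an ASCII spelling in Base.
xs = [1, 2, 3]

membership_unicode = 2 ∈ xs            # typed as \in<TAB> in the REPL
membership_ascii   = 2 in xs           # plain ASCII keyword form

xor_unicode = true ⊻ false             # typed as \xor<TAB>
xor_ascii   = xor(true, false)         # plain ASCII function call

euler_unicode = ℯ                      # typed as \euler<TAB>
euler_ascii   = Base.MathConstants.e   # ASCII name for the same constant
```

A package name has no such second spelling, which is the asymmetry described above.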
FWIW, Cai does not seem too bad of an approximation of Cài. Also, the character in front of “Semiprimes” in the second name does not even render for me.
There are no packages in General with non-ASCII names and I don’t think we would want to start now. Feel free to PR the ASCII requirement to Pkg’s docs.
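As a rough illustration of what an ASCII-only rule could look like, here is a minimal sketch. The helper `looks_like_ascii_package_name` is hypothetical and is not part of Pkg or of any registry tooling; the rule it encodes (ASCII letters and digits only, starting with an uppercase letter) is an assumption made for this example, and the actual registry checks may differ.

```julia
# Hypothetical helper: NOT the real registry check, just an assumed rule
# (ASCII letters/digits only, first character uppercase).
function looks_like_ascii_package_name(name::AbstractString)
    isempty(name) && return false
    return isascii(name) &&
           isuppercase(first(name)) &&
           all(c -> isletter(c) || isdigit(c), name)
end

candidates = ["CaiSemiprimes", "CàiSemiprimes", "蔡Semiprimes"]
results = Dict(name => looks_like_ascii_package_name(name) for name in candidates)

# 蔡 has no upper/lowercase distinction, so both predicates are false for it.
cai_has_case = isuppercase('蔡') || islowercase('蔡')
```

Under this assumed rule only the first candidate passes; the second fails the ASCII test and the third fails both the ASCII and the uppercase test.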
Still: do not require Unicode in the public API of your package; you can use something like
function f(; eta=1.0, η=eta) end
if you want to allow for Unicode in your API, in addition to ASCII. Certainly, do not require it in your package name. Internally, you can go wild. But also: know your audience. What might be a useful addition to the heavily mathematically educated users of one package might severely limit the userbase of another. I’d strongly recommend keeping things in English, as the lingua franca of scientific computing. I would actually support having this as a requirement for the General registry, and then have separate “regional” registries for Spanish, Chinese, etc. It’s certainly worthwhile to also have code in one’s native language, especially for very young students, even if that means that code can’t have a global reach.
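Here is a slightly fuller sketch of the keyword-alias pattern above. The function `scale` and its body are made up for illustration; the only thing taken from the post is the `eta=1.0, η=eta` signature.

```julia
# Hypothetical function: `eta` is the ASCII keyword, `η` an optional Unicode alias.
# The default of `η` refers to `eta`, so callers can pass either one.
function scale(x; eta=1.0, η=eta)
    return η * x
end

via_default = scale(2.0)            # neither keyword given, η falls back to eta = 1.0
via_ascii   = scale(2.0; eta=3.0)   # ASCII spelling
via_unicode = scale(2.0; η=3.0)     # Unicode spelling
```

One design caveat: if a caller passes both keywords, `η` wins silently, so a real API might want to document that or check for it.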
I think my main argument against non-plain-ASCII comes from an experience I had with the good old German ü, but also with à itself. There are different ways to represent these. For example, one can either use the precomposed ü directly (U+00FC, Latin Small Letter U with Diaeresis) or a plain u followed by the combining diacritic (U+0308, Combining Diaeresis).
I once had a case with both ü and à where the single-character form was replaced by the combining form when copying files, but not when copying the string (within some code). Visually they are the same. This might lead to some unwanted confusion beyond the “where is that on my keyboard?” problem.
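The two representations can be compared directly in Julia. This is a minimal sketch using the `Unicode` standard library; the variable names are only for illustration.

```julia
using Unicode

precomposed = "\u00fc"     # ü as one code point (U+00FC)
decomposed  = "u\u0308"    # u followed by COMBINING DIAERESIS (U+0308)

same_string  = precomposed == decomposed                # compares code points, not glyphs
char_counts  = (length(precomposed), length(decomposed))

# Normalizing both sides to the same form makes them comparable again.
same_after_nfc = Unicode.normalize(decomposed, :NFC) == precomposed
same_after_nfd = Unicode.normalize(precomposed, :NFD) == decomposed
```

Both strings render identically, yet they are unequal until normalized, which is exactly the kind of mismatch that can appear between a file or directory name and a string in code.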
I think that’s because this is actually a hard requirement, it just wasn’t documented. So there wasn’t a need for a decision making process (it already is a requirement), just documentation.
This might lead to some unwanted confusion beyond the “where is that on my keyboard?” problem.
Yeah, IMO that is the main problem. We want the system to be very simple and unambiguous to prevent typosquatting, visual confusion attacks etc. We could probably have some very constrained set of unicode characters allowed but it would need to be fully thought-through from an ecosystem (and tooling) point of view and the pros would need to outweigh the cons. And since it adds to the complexity & maintenance burden of a volunteer-run system, there’s some fairly big cons, unfortunately.
Complexity and maintenance would also be my main reasons, for example when helping new (and confused) users.
Sure, the one exception that could be made is of course the watch emoji, the Julia icon, and the flame emoji, to have a nicer name for WatchJuliaBurn.jl
You can go wild in your personal registry. I’m fairly sure the registry tooling doesn’t even require valid Julia identifiers, although it would be utterly pointless to have a package that you can neither Pkg.add nor import.