Do you really think everyone knows what Unicode is?
I don't really get what point you're trying to make here. Like, is your point that nobody finds unicode to be an API accessibility issue in practice, or what?
I was just teasing at the irony of the chicken-egg problem; I do appreciate the ability to use unicode in my scripts, although maybe not for public-facing APIs.
Firstly, this is not general knowledge. Secondly, at a given moment you may know you need a Greek letter, but not have one available to copy-paste. And thirdly, it only works if your editor supports latex-to-unicode commands.
And fourthly, it's just more work.
My point is that an ascii-only campaign comes at a cost and that having a realistic model of one's audience is key in making decisions about readability.
If there are lots of people using Julia without the ability to render Unicode, for example, I'd like to know about them and their circumstances; if there aren't, we shouldn't invoke costs to such people in our arguments.
My questions would be:
What platform, editor, educational background, eyesight, etc. do my users have? How do those affect their ability to use characters fluently? Without knowing these things, appeals to those users' needs are on shaky ground.
Most Julia users are in VSCode, which edits unicode. Which of the other editors can't?
Finally, how much of this discussion is simply a personal aesthetic preference or even just familiarity? Is this just like 1-based indexing? If we encourage users to do unicode, will they not come to like it more? If we use it more, won't it become less of a problem as knowledge of how to use it diffuses?
Thanks for clarifying.
Well, as I said, I don't have strong feelings about it one way or another. I certainly prefer unicode, but there also are enough people out there in the ecosystem who have repeatedly voiced their concerns and opposition to unicode-only APIs that I think it's not such a big deal to accommodate it. Especially if we come up with nice patterns to make it easy to do so, like the one @moble showed.
I can definitely say that there have been times where I've SSH'd into a running server from my phone, where unicode input is more annoying than usual, and every time I've done that I've been mildly thankful for ASCII APIs.
I can't speak to dyslexia or low vision. But I don't see how using `rho` instead of `ρ` (or my example of `Lambda_1` instead of `Λ₁`) will make any difference at all to people with lower reading levels or abilities or learning difficulties. As for non-native English speakers, I work with them all the time, and they are at least as capable of discerning Unicode as I am, usually more so. As a person with limited time and attention span, I personally find Unicode that looks like the math symbols I am already familiar with much faster and less fatiguing to process than ASCII.
As a sighted user, I find ASCII translations of Unicode much "noisier" than Unicode.
Maybe we're talking past each other. When I think about Unicode in Julia, I'm almost exclusively thinking about representing mathematical symbols that are already used in the literature that inspired us to write the code in the first place. Understanding the code largely depends on understanding those concepts from the literature first, which usually involves an ability to understand the symbols.
Is this also the sort of usage that you envision and find objectionable? Is there some broader class of uses that people find objectionable?
There are some examples of symbols in this thread that I find much less clear than the ASCII transcription of them. To a degree that may be a function of the screen resolution that is available for the display of the character, of course. But, when sticking to ASCII, one may be sure that "what you see is what you get". Not so with many Unicode characters.
Since you asked for an example: `a'` vs `a’` vs `a′`. These are not the same thing. Only one of them is a transpose.
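To make the difference concrete, here is a minimal sketch that prints the code point of each of the three look-alike marks. The helper name `codepoint_label` is made up for this illustration; the three characters are assumed to be the ASCII apostrophe, the right single quotation mark, and the prime.

```julia
# Format a character's code point as "U+XXXX" (illustrative helper, not from any package).
codepoint_label(c::Char) = "U+" * uppercase(string(UInt32(c), base=16, pad=4))

# ASCII apostrophe, right single quotation mark, and prime, written as escapes
# so that the source itself stays unambiguous.
for c in ('\'', '\u2019', '\u2032')
    println(c, "  ", codepoint_label(c))
end
```

Of the three, only the ASCII apostrophe (U+0027) is Julia's postfix operator, so `a'` is an operation on `a`. As far as I know, the prime (U+2032) is instead accepted as part of an identifier, which would make `a′` a separate variable name.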
That's a fair point. I am sometimes surprised by what shows up on GitHub. And even using the excellent JuliaMono, some character combinations appear incorrectly in VS Code, Emacs, and Terminal (which seems to be the fault of those programs, rather than JuliaMono).
I'll also admit to a little concern when using things like `a′`. But context is important, and I only use `\prime` when it's already used in the literature.
Here's a little anecdata: my project is 3,597 lines of code (in `src`, plus another 1,730 in `docs`+`test`), of which ~30 particularly easy-to-write lines are devoted to this API translation. It doesn't feel like much cost to me.
I suspect that it is entirely aesthetic for a lot of users, but I still want them to use my code.
Note that a Unicode API won't just "encourage", it will require users to use Unicode. That might be enough to discourage the particularly lazy Julia user, but it would also make it literally impossible to use even slightly fancy Unicode from Python.
- Unicode is fine within code where it increases legibility, but in no case should Unicode be used in public APIs. This is to allow support for terminals which cannot use Unicode: if a keyword argument must be η, then it can be exclusionary to uses on clusters which do not support Unicode inputs.
That's the reasoning for disallowing it in SciMLStyle.
Do you happen to have a reference to these clusters? I haven't seen that restriction myself.
Lots of older clusters like the XSEDE ones had this restriction. It can also depend greatly on the terminal that is used: newer terminal versions support it on "standard" hardware, but as you get to other hardware or more legacy systems you tend to have more issues with unicode.
I like using Unicode in my code. What I hate is that many superscripts/subscripts are incomplete, like subscript `f`.
It seems like there is a setting to support UTF-8 input on the specific terminal mentioned in another thread.
Okay sure, but I cannot just assume that everyone that will ever use Julia has read that post. So therefore, it will not go into public APIs because it's not inclusive. No matter what some hardcore person says, unicode in keyword arguments was an easy way to get 30 confused emails a month back in 2018, before we even had big adoption. That's why it's disallowed, and I am sure that with the reach we have now it would be hundreds of people having issues using the software because of one small difference in a naming choice. It's just not worth it for software that has a wide reach.
How do you feel about the earlier post showing how to allow both character sets for keyword arguments, by setting the unicode argument equal to the non-unicode one by default, and only using the unicode one internally? Should that be disallowed in the style guide?
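For readers who missed the earlier post, the pattern being asked about can be sketched roughly as follows. This is only an illustration under my own assumptions: the function `decay` and the keyword names `eta` and `η` are hypothetical and not taken from any package in this thread.

```julia
# Hypothetical function; `decay`, `eta`, and `η` are illustrative names only.
# `eta` is the ASCII spelling of the keyword. `η` defaults to whatever `eta` is,
# and the body only ever reads `η`.
function decay(x; eta=0.1, η=eta)
    return x * exp(-η)
end

# Both spellings reach the same code path.
println(decay(1.0; eta=0.5) == decay(1.0; η=0.5))
```

This relies on Julia evaluating keyword defaults left to right, so the default for `η` can refer to `eta`. One design wrinkle: if a caller passes both keywords, `η` silently wins, which is part of the extra API surface being debated here.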
In general, reducing API surface just makes things easier to maintain. I would just prefer a single keyword argument for that reason. We already have way too many kwargs, I don't want more.
Personally, I've found much less use for unicode in APIs than for internal temporary variables, or function input arguments. So it is less a matter of self-restraint or consideration for people on exotic terminals, than it is of personal preference.
Non-ascii unicode symbols are really useful for variables, especially when they are part of complex mathematical expressions. But an API consists of function names, type names and keyword arguments, and I don't really see how unicode symbols are that appropriate for those.
While a variable is just a placeholder for a value (and can often be abstract or defy concrete naming), API functions, on the other hand, almost always describe a limited number of well thought-out, concrete actions with obvious names, even in cases where there exist popular symbolic representations of them in the scientific literature.
Keyword arguments are something that you would use carefully, and for cases where you want to be extra clear about intent. Keywords are, well, words, in my mind. If your keyword is a unicode symbol, perhaps it should actually be a positional argument instead?
As for types, I've just never thought about giving them unicode names. It is conceivable that it could be useful, but is this common?