The Hegemony Of `==` Must End; Welcome The `⩵` Upstart

blackeneth · June 26, 2021, 2:40pm

A lot of discussion took place in “My Brain Is Hard-Wired Against ==; Help Me, Julia”. I thought I would post my response as one monolithic post.

The Proposal

Create a Unicode character alias for the relational operator ==

Possible Unicode Characters to use:
I present to you three possibilities:
U+0225F ≟ (question equals) \questeq
U+0229C ⊜ (circled equals) \circledequal
U+02A75 ⩵ (two consecutive equal signs) \Equal or \Equ

I would think the leading candidate would be U+02A75, as it looks the same as == and therefore has a similar visual appearance in code:

if x == 5 
if x ⩵ 5

Typing shortcuts
The above all have typing shortcuts. What we want is one that is short and easy to type (“easy to type” means: short, lowercase, and easy keys to reach for touch typists).

The “leading candidate,” ⩵, already has a shortcut that is not too bad: \Equ.

However, there is an opportunity for an even better shortcut. Recall that ≠ has shortcut \ne. The shortcut “\ie” (“is equal”) is currently not used. We could have a nice correspondence between \ne (not equal, ≠) and \ie (equality, ⩵).

Bottom Line - What is Your Ideal Outcome

Implement U+02A75 ⩵ as a Unicode alias for the == equality operator
Create keyboard shortcut \ie to enter it.

What The Proposal Does Not Do

Change how the == relational operator works
Change how the relation is evaluated

Why Do this?
In the end, to benefit the user by helping them avoid the mistake of using the assignment operator (=) for the relational operator (==).

Julia is uniquely positioned to mitigate this problem, which has existed since 1957.

Would People Use it?
People already use \ne \leq \geq and find benefit from it. A similar benefit is to be had from \Equ (or \ieq or \ie)

What Are the Justifications?

ONE - From Design

Computer programming languages, ideally, follow good design principles.
The design principles are minimality, referential transparency, orthogonality, correspondence, abstraction, pattern-based design, meta-model design.
Here is another: complete, balanced sets.

Julia currently has these standard comparison operators:

Operator	Alias	Name
==		equality
!=	≠	inequality
<		less than
<=	≤	less than or equal to
>		greater than
>=	≥	greater than or equal to
===	≡	equivalence

Note how all two- and three-character operators have a single-character Unicode alias – except for equality.

Adding ⩵ (“leading candidate”) for == would complete the set.
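For readers who want to check the existing aliases themselves, here is a minimal sketch that can be pasted into the REPL. It only covers the aliases that exist today; it does not use ⩵, which has no definition in Base.

```julia
# Each Unicode alias is the very same function object as its ASCII spelling,
# so comparing them with === (object identity) gives true.
@show (≠) === (!=)
@show (≤) === (<=)
@show (≥) === (>=)

# Because they are the same function, both spellings can be mixed freely.
x = 5
@show (x ≠ 4) == (x != 4)
@show (x ≤ 5) == (x <= 5)
```

An alias for == along the lines of the proposal would presumably be defined in the same way as these.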

TWO - From Human Cognition
Computer programming languages are also, ideally, compatible with how humans think. Part of that is designing against common errors that humans make.

Some computer languages are cognitively hard, either intentionally or otherwise.

Having a Unicode alias for == would help make Julia better for human cognition and help users avoid the mistake of using the assignment operator for the relational operator.

Human errors can be classified as “mistakes” or “slips”. What distinguishes them is the intention of the person. An error in the intention is called a mistake. An error in carrying out the intention is called a slip.

The error we are discussing is a “slip” – the user intended to type the relational operator (==), but instead typed the assignment operator (=). Furthermore, this is a skill-based error, as the user knows the language syntax and is skilled at typing.

So what causes the slip? Slips can be categorized based on their presumed sources. The slip of using = for == could be one of the following, or a combination of these:

A blend slip, which combines components from two competing schemas (where a “schema” is an organized memory unit).
A mode error, an erroneous classification of the situation. Users move among many different environments where “=” has different meanings: mathematics and some programming languages use “=” as a relational operator.
Associative activation, in which the currently active schema activates others with which it is associated.

Adding to the probability of making this slip is high familiarity with the task and low attention to it – our being on “auto pilot”. Skilled touch typists don’t have to think about how to move their fingers to generate letters and words; thought gets translated to action without cognitive effort. The user’s cognitive effort is focused on solving the problem for which they’re writing a program. Furthermore, actions done on “auto-pilot” are more subject to corruption from interruptions.

sources:
“Design Rules Based on Analysis of Human Error,” by Donald A. Norman, Communications of the ACM, April 1983, Vol. 26, No. 4
“A comprehension-based model of correct performance and errors in skilled, display-based, human-computer interaction,” by Muneo Kitajima and Peter G. Polson, Int. J. of Human-Computer Studies (1995) 43, 65-99
“Categorization of Action Slips,” by Donald A. Norman, Psychological Review, Jan. 1981, Vol. 88, No. 1

THREE - From Data

This is a common error. How do we know?

A) Linters and compilers are programmed to generate a warning for it.

The error in question is syntactically correct. Why would a linter or a compiler generate a warning? Because it’s a common error.

The VSCode Linter for Julia highlights it.
gcc has a command-line switch to warn on it (-Wall or -Wparentheses)
SAS JMP JSL will pop up a dialog upon script execution to ask the user if they really intended to do assignment in the if (answer: always no)

Other erroneous, but syntactically correct code is not highlighted, such as:

if a==5 && a==6 
    println("a is both 5 and 6")
end 


if 1 & 0 > 9
    println("Happy day")
end 

if findfirst(@assert) chop(@macro) diagonal()
    println("nonsense passes lint")
end

B) The development of Yoda conditions to guard against this error.

C) Many lists of “common programming errors” and other discussions include using the assignment operator for the relational operator:

Top 5 Most Common Silly C++ Mistakes – see comments
The 10 Most Common C++ Mistakes
Which common mistakes do beginner C++ programmers often commit?
Confusion with assignment operators
Assignment versus equality
Some Common C Programming Bugs
8 Common Programming Mistakes
Common Beginner C++ Programming Mistakes
Exposing The Most Frequent Mistakes In Programming

Testimony
@D_A
I understand you very well and I am glad to see that I am not alone…

Personally, as I often work with SQL, XSL (and XPath), I regularly make this mistake too by putting = instead of == for equality, and as the ≠ symbol is shorter and cleaner (I never use !=), I tend to do my tests reversed with ≠ instead of direct ==…

Objections

Objection: You can do it yourself

@jzr

julia> a ≟ b = a == b
≟ (generic function with 1 method)
julia> 4 ≟ 4
true

I replied: that makes you the only creepy dude in the world who does it, and you get puzzled questions from library maintainers.

@Tamas_Papp replied
Nay, that’s fine, others can just look up the function in your code, the tooling is there, like for any other function.

If you want an alias, just do it as @jzr suggested, but it is unlikely that you can make others change the way they code, or that the language should support you in this attempt.

@sijo replied:
I disagree, I think it’s generally a bad idea to introduce non-standard syntax without a strong reason. “Others can just look up…” may be right but missing the larger picture, which is that even small non-standard things add up. This can make codebases significantly less accessible to new readers/contributors. Even one idiosyncrasy is an unnecessary obstacle that can be annoying to people who jump through many codebases in a single day.

Response: redefining it yourself is code obfuscation. If I contribute that code to a library, I haven’t just done it for myself, but for you too. What if someone takes my code and wants to modify it? Do they follow my non-standard syntax or the standard syntax?

Defining your own syntax violates the program design principles of simple, reliable, and adaptable. Namely, not simple and not adaptable.

Objection: it would break code already using the chosen symbol

my assertion
it doesn’t break anything

@Tamas_Papp
Technically it would break code already using these symbols as a function name or something.
Also, not using up a lot of the Unicode operator selection with various defaults was a very sane decision for Julia, since that allows users to make use of them.

Response: True enough, but that’s also true for the Unicode aliases released in 1.7 beta:

⫪ (U+2AEA, \Top, \downvDash) and ⫫ (U+2AEB, \Bot, \upvDash, \indep) may now be used as binary operators with comparison precedence (#39403).
The middle dot · (\cdotp U+00b7) and the Greek interpunct · (U+0387) are now treated as equivalent to the dot operator ⋅ (\cdot U+22c5) (#25157).
The minus sign − (\minus U+2212) is now treated as equivalent to the hyphen-minus sign - (U+002d) (#40948).

Or in 1.6:

ꜛ (U+A71B), ꜜ (U+A71C) and ꜝ (U+A71D) can now also be used as operator suffixes. They can be tab-completed from ^uparrow, ^downarrow and ^! in the REPL (#37542).

I can go on. Hey, why not one more:

In 1.5:

⨟ is now parsed as a binary operator with times precedence. It can be entered in the REPL with \bbsemi followed by TAB (#34722).

The Unicode standard has 143,859 characters. We aren’t significantly depleting that resource if we use one more for an alias for ==.

Objection: would this really help the cognitive problem?

@yha asks
I don’t see how an alias for == would help with your original problem of mistakenly using = when you mean ==. What’s the connection?

Response:
It helps because, instead of typing a compound symbol made of two equal signs (which have a strong association with “equality”), you type \ie (or \Equ or \ieq).
One won’t mistake \ie for = .
I use \ne, \leq, and \geq all the time, even though it requires me to type more characters (including the annoying \ one). Also, it slows you down, so you think about it more (not on “auto pilot”).

Objection: It’s just one of many common errors
@oheil replied:
There are a lot of common errors. E.g. forgetting ; at the end of a line, forgetting a closing " at the end of a string, using ' instead of " for strings, just to name a few which come to mind for this and other languages.

Response: Not quite the same.

Forgetting to use “;”, or using “;” when you don’t need to is a slip with a mode cause.
Using " or ’ for strings, when you should use the other, is a slip with a mode cause.
Whereas using = for == combines aspects of blend, mode, and associative activation.

A (hypothetical) analogous situation could arise for booleans:
∧ is the symbol for AND, and ∨ is the symbol for OR.
In our (hypothetical) language, however, ∧ is used for exponentiation.
So when you want to use AND in a boolean expression, you have to type ∧∧

Objection: the VSCode linter already highlights it

Response: this is not a very effective safety for catching this problem. The VSCode interface is an angry fruit salad and the linter is a fireworks show on top of it.

What do I mean by that? While typing, the linter is always highlighting stuff, popping up boxes for completions, or definitions, or underlining stuff, or changing the colors. You type in a bit of code, and of course – since you haven’t completed it yet – it breaks the downstream syntax, so the linter highlights and recolors it, only to revert once you’re done typing. Or you type in “#=” to start a block comment, and all the text afterwards changes to the “comment color”, only to revert again when you put the “=#” in. In short, the linter cries wolf too often. This leads humans to ignore it. Ironically, it might be more effective if it were slower.

Objection: just use isequal()

Response: not a bad idea, although it is a little troubling that it doesn’t act exactly like ==. So it is not a direct replacement.
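To make the difference concrete, here is a small sketch of the cases where isequal and == are known to disagree (NaN, signed zeros, and missing). It only illustrates the point; it is not an exhaustive list.

```julia
# NaN: == follows IEEE 754 (NaN is not equal to itself),
# while isequal treats NaN as equal to NaN.
@show NaN == NaN
@show isequal(NaN, NaN)

# Signed zeros: == considers them equal, isequal distinguishes them.
@show -0.0 == 0.0
@show isequal(-0.0, 0.0)

# missing: == propagates missing (three-valued logic),
# while isequal always returns a Bool.
@show missing == 1
@show isequal(missing, 1)
```

So swapping one for the other inside an if can change behavior, and with missing it can turn a Bool condition into a non-Bool one.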

Finally:
@Tamas_Papp
I am not sure that there is anything else that can be done here, short of consulting a brain specialist.

Response:
That would have to be a team of brain specialists, ideally at a research university.

Sukera · June 26, 2021, 2:42pm

blackeneth · June 26, 2021, 2:43pm

1.7 beta:

⫪ (U+2AEA, \Top, \downvDash) and ⫫ (U+2AEB, \Bot, \upvDash, \indep) may now be used as binary operators with comparison precedence (#39403).
The middle dot · (\cdotp U+00b7) and the Greek interpunct · (U+0387) are now treated as equivalent to the dot operator ⋅ (\cdot U+22c5) (#25157).
The minus sign − (\minus U+2212) is now treated as equivalent to the hyphen-minus sign - (U+002d) (#40948).

1.6:

ꜛ (U+A71B), ꜜ (U+A71C) and ꜝ (U+A71D) can now also be used as operator suffixes. They can be tab-completed from ^uparrow, ^downarrow and ^! in the REPL (#37542).

1.5:

⨟ is now parsed as a binary operator with times precedence. It can be entered in the REPL with \bbsemi followed by TAB (#34722).

Sukera · June 26, 2021, 2:51pm

Aside from 2. (middle dot) and 3. (minus sign), all of these did not change anything about the core language and made those symbols available for user code to use as people see fit in their packages.

If you take a look at the respective issues/PRs on GitHub, you’ll notice that they were accepted either because they’re old (e.g. the middle dot one stems from 0.7, when the big push for stabilizing syntax happened and it seems like it just wasn’t merged back then) or because the asked-for syntax is seen as indisputably equivalent (and allows copying from LaTeX PDFs, apparently).

Note also that before any new version is released, the new version is tested against all released & in General registered packages, to make sure it doesn’t break anything.

Further, I’m not sure I understand how remembering to write \Equ\Equ is different from remembering to write == instead of =? Feels much more cumbersome to write to me, disincentivizing adoption.

oheil · June 26, 2021, 2:51pm

While writing all this stuff you could have done the PR…

While I have read all this stuff, I could have done the PR…

jling · June 26, 2021, 3:12pm

infinitely this. Btw the title is misleading: == isn’t going anywhere

oheil · June 26, 2021, 3:35pm

It’s the “Hegemony of ==” which has to go (according to OP)!

From Hegemony - Wikipedia

the political, economic, or military predominance or control of one state over others

blackeneth · June 26, 2021, 4:05pm

You just write \Equ[TAB]

Try it in the REPL

Or \ie[TAB] if a new shortcut is created

jling · June 26, 2021, 4:14pm

at the moment you need to press: \<Shift>Equa<tab>, that’s 7 vs. == 2. IDK, seems not worth it bruh

Sukera · June 26, 2021, 4:50pm

Well, for me (German keyboard) it’s (<Right-ALT> + ß) + (<SHIFT> + e) + q + u + a + <TAB>/l + <TAB> for ⩵ vs. (<SHIFT> + 0) for =, so 9 vs. 2 keystrokes (or 9 vs. 3 for the double version, keeping SHIFT pressed for the second =).

That’s part of the problem with using Unicode characters for core functionality - not everyone has the same keyboard layout, so it’s not necessarily simple to write these things, even with TAB-completion in the REPL.

As has been mentioned in the other thread though, you’re free to use those in your own packages, provided you ship the equivalent functions in all code you’ve packaged up so everything runs on other people’s machines as well (please expose an ASCII only interface as well - few things are more frustrating than having trouble using code because writing it uses too many unicode exclusive things). I just doubt it would be added to the core language.

DNF · June 26, 2021, 5:49pm

It sounds to me like what you actually need is just some better visual hints as to what character you are using. Perhaps a font with ligatures, though some of these de-emphasize rather than emphasize the difference between = and ==.

But I think the best solution would be to customize your syntax highlighting scheme. The assignment operator is arguably ‘special’, and it would make sense to give it a color that is different from other operators. I’m no good at hacking syntax highlighters, but maybe someone here has a tip how to do it? In the REPL, there is OhMyREPL.jl, but I’m not sure if it allows special-casing single characters.

CameronBieganek · June 26, 2021, 6:07pm

If you’re using VS Code, put the following into your “settings.json” file:

"editor.tokenColorCustomizations": {
    "[Julia (Monokai Classic)]": {
        "textMateRules" : [
            {
                "scope": "source.julia keyword.operator.update.julia",
                "settings": {
                    "foreground": "#E6DB74"
                }
            }
        ]
    }
}

For this example, I’ve used my Julia (Monokai Classic) theme, but you can use any theme you like. And of course you can change the color by changing the hex color code in the "foreground" property.

jzr · June 26, 2021, 6:52pm

You can also configure your Julia editor/REPL to insert == when you type \ie. I think that would address your concern that defining your own operator “makes you the only creepy dude in the world who does it, and you get puzzled questions from library maintainers” because you’re only changing your own editing environment, rather than the resulting source code.

I’m not exactly sure how to do it, but the relevant references seem to be

https://docs.julialang.org/en/v1/stdlib/REPL/#Customizing-keybindings

https://docs.julialang.org/en/v1/stdlib/REPL/#Tab-completion
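One possible way to wire this up is sketched below, as something to try in ~/.julia/config/startup.jl. It rests on an assumption: latex_symbols is an internal, undocumented table in the REPL standard library, not a supported API, so it may change or disappear between Julia versions. The shortcut name \ie is the hypothetical one from the original post.

```julia
# ASSUMPTION: REPL.REPLCompletions.latex_symbols is an internal Dict mapping
# backslash shortcuts to their replacement text. It is not public API.
using REPL

# Map the hypothetical shortcut \ie to the plain ASCII operator ==,
# so that typing \ie followed by TAB inserts "==" rather than a Unicode symbol.
REPL.REPLCompletions.latex_symbols["\\ie"] = "=="
```

Since the completion inserts ordinary ASCII, the source files stay standard and nobody else has to know the shortcut exists.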

sijo · June 26, 2021, 7:38pm

I just wanted to say that I enjoyed reading your post with its comprehensive argumentation and the little cognitive science detour. But despite its eloquence it doesn’t convince me that adding a Unicode synonym for == is a good idea… Partly because I generally prefer it when there’s one way to do things (it makes every piece of code in the ecosystem more familiar) and partly because I still think the issue is better solved by tooling.

I didn’t find your objection to the lint convincing: it doesn’t matter if colors are blinking while you type. What matters is that when you move to the next line you will have a lint warning over there that stands out.

I also like the other suggestions in this thread. To summarize:

use the linter, or
use a font with a ligature for == that stands out, or
use distinctive syntax highlighting, or
define a shortcut like \ie that completes to ==.

These all seem workable to me.

D_A · June 26, 2021, 8:22pm

The idea of special ligatures for the font is interesting. It would certainly be very beneficial to have optional automatic replacements for = (turned into ← for example) and == (turned into = for example). Julia Mono already allows optional symbols suitable for |> and =>, after all. This would allow a personal local syntax without getting in the way of the outside world.

johnmyleswhite · June 27, 2021, 12:24pm

This feels a bit like an IDN homograph attack done with good intentions. It seems like a thing where Julia would want to do more canonicalization rather than to encourage distinguishing two very similar strings.
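As an illustration of the kind of canonicalization Julia already does, here is a short sketch. The micro sign case is documented behavior; the minus sign case relies on the 1.7 change quoted earlier in this thread, so it assumes Julia 1.7 or later.

```julia
# The micro sign (U+00B5) and the Greek letter mu (U+03BC) look identical.
# The parser maps both to the same identifier.
micro = Meta.parse("\u00b5")
mu    = Meta.parse("\u03bc")
@show micro === mu

# Julia 1.7+ (assumption about the running version): the Unicode minus sign
# (U+2212) is parsed as the ASCII hyphen-minus.
@show Meta.parse("3 \u2212 1") == :(3 - 1)
```

In both cases two look-alike spellings collapse into one meaning at parse time, rather than coexisting as separate names.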

Tamas_Papp · June 27, 2021, 2:37pm

This is a non sequitur.

Also, as mentioned in the other topic where you proposed this, in Julia

if a = b

already errors (with syntax: unexpected "="). But if you are super-concerned about this, just use a linter.
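This can be checked without typing it at the REPL. The sketch below uses a hypothetical helper, parses_ok, which asks the parser for an expression instead of raising; the exact wording of the error message differs between Julia versions, so the helper only looks at whether parsing failed.

```julia
# Hypothetical helper: returns true when `src` parses without a syntax error.
# With raise = false, Meta.parse returns an Expr with head :error (or
# :incomplete) instead of throwing.
function parses_ok(src::AbstractString)
    ex = Meta.parse(src; raise = false)
    return !(ex isa Expr && ex.head in (:error, :incomplete))
end

@show parses_ok("if a == b\nend")   # comparison in the condition
@show parses_ok("if a = b\nend")    # bare assignment in the condition
```

Note that this only covers a bare assignment directly in the if condition, which is the case discussed here.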

Finally, the point of Unicode aliases is generally to make code look shorter and similar to math. Your proposal does not help with either, so it’s very likely to end up unused, except by a few people. But since you can already define custom aliases for all functions (again, as mentioned in the other topic), you can just do this without redesigning the language for everyone.

Let me make a counterproposal along these lines: define your favorite alias for ==, and see how it works out for you. If after 6 months / 10kLOC you still want to press 5+ keys to get what is basically ==, wrap it up in a mini-package and register it. If after a while you get a bunch of users, revive this topic: you will be in a much better position to argue for it.
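A minimal sketch of what such a mini-package could contain is below. The module name EqualityAlias is hypothetical, and ≟ is used only because it is the symbol from @jzr’s example; any other operator character the parser accepts could be substituted.

```julia
module EqualityAlias  # hypothetical package name

export ≟

# A const alias binds the new name to the existing == function itself,
# the same pattern Base uses for ≠ and !=.
const ≟ = ==

end # module

using .EqualityAlias

@show 4 ≟ 4
@show 4 ≟ 5
@show (≟) === (==)
```

Unlike the method definition a ≟ b = a == b, the const alias does not create a new generic function; ≟ and == are one and the same object.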

cgeoga · June 27, 2021, 5:08pm

If anything, isequal is very slightly more appropriate for general-use purposes than == because of how it handles edge cases like different NaNs and stuff (try isequal(NaN, NaN*5) versus NaN==NaN*5. I’m not suggesting == is bad, but I think your response dismissing that solution is not particularly compelling). I agree with @johnmyleswhite that introducing a very similar looking character for effectively the exact same purpose as == invites a ton of problems. And it would probably lead to much stranger issues that are harder to track down than the occasional if x = 5 ; ....

lostella · June 27, 2021, 11:38pm

How is == (or the lack of an equivalent symbol) responsible for people mistakenly using = to test equality?

How would adding a single character alias for == fix the problem that, sometimes, = ends up used to test equality?

apo383 · June 28, 2021, 4:49pm

I don’t share OP’s difficulties with ==, but I do agree with their point One: [EDIT:] An alias using ligature ⩵ would look nice and be consistent with ≥. After all, what purpose could it serve other than an alias for ==?

Going a bit further, such aliases could conceivably be integrated further into the REPL. In Mathematica notebooks, characters get auto-magically converted as you type:
[Image: mydemo]

Mathematica also has some smart cut-and-paste action. If ≥ is pasted back into a notebook, it remains ligature, but if pasted as plain text elsewhere it reverts back to separate characters >=. This feature seemed kind of freaky when they introduced it, but the reality is that it’s quite transparent and you never have to think about it. I was too complacent to learn \ge in Julia REPL, but wouldn’t mind being auto-corrected into something nicer looking.
