A lot of discussion took place in “My Brain Is Hard-Wired Against ==; Help Me, Julia”, so I thought I would post my response as one monolithic post.
The Proposal
Create a Unicode character alias for the relational operator ==
Possible Unicode Characters to use:
I present to you three possibilities:
U+0225F ≟ (question equals) \questeq
U+0229C ⊜ (circled equals) \circledequal
U+02A75 ⩵ (two consecutive equal signs) \Equal or \Equ
I would think the leading candidate would be U+02A75, as it looks the same as == and therefore has a similar visual appearance in code:
if x == 5
if x ⩵ 5
Typing shortcuts
The above all have typing shortcuts. The desire is for one that is short and easy to type (“easy to type” means: short, lowercase, and easy keys to reach for touch typists).
The “leading candidate,” ⩵, already has a shortcut that is not too bad: \Equ.
However, there is an opportunity for an even better shortcut. Recall that ≠ has shortcut \ne. The shortcut “\ie” (“is equal”) is currently not used. We could have a nice correspondence between \ne (not equal, ≠) and \ie (equality, ⩵).
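One way to check which shortcuts are already taken on a given Julia installation is to look at the REPL's completion table. This is only a sketch: it assumes that `REPL.REPLCompletions.latex_symbols` is a dictionary from backslash names to characters, which is an internal detail of the REPL standard library rather than a documented API, and `shortcut_target` is a hypothetical helper name. The result for \ie depends on the Julia version you run.

```julia
using REPL  # standard library; the table used below is internal, not a documented API

# Assumption: this Dict maps strings such as "\\ne" to the character they expand to.
const completions = REPL.REPLCompletions.latex_symbols

# Hypothetical helper: return the character a shortcut expands to,
# or `nothing` when the shortcut is not in the table.
shortcut_target(name::AbstractString) = get(completions, "\\" * name, nothing)

for name in ("ne", "leq", "geq", "ie")
    target = shortcut_target(name)
    println("\\", name, " => ", target === nothing ? "(unused)" : target)
end
```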
Bottom Line - What is Your Ideal Outcome
- Implement U+02A75 ⩵ as a Unicode alias for the == equality operator
- Create keyboard shortcut \ie to enter it.
What The Proposal Does Not Do
- Change how the == relational operator works
- Change how the relation is evaluated
Why Do this?
In the end, to benefit the user by helping them avoid the mistake of using the assignment operator (=) where the relational operator (==) was intended.
Julia is uniquely positioned to mitigate this problem, which has existed since 1957.
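To make the mistake concrete, here is a minimal sketch of how it can pass silently in Julia. The function names are made up for illustration. Julia does raise a TypeError when the assigned value is not a Bool (for example `if x = 5`), so the silent case is the one where the right-hand side is already a Bool.

```julia
# Intended logic: the branch is taken when `flag` equals `expected`.
function matches_intended(flag::Bool, expected::Bool)
    if flag == expected
        return true
    end
    return false
end

# The slip: `=` rebinds `flag` to `expected`, so the branch now depends
# only on `expected` and the comparison never happens.
function matches_slip(flag::Bool, expected::Bool)
    if flag = expected
        return true
    end
    return false
end

println(matches_intended(false, false))  # true
println(matches_slip(false, false))      # false
```

Both functions parse and run without any error, which is what makes the slip hard to spot.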
Would People Use it?
People already use \ne, \leq, and \geq and find benefit from them. A similar benefit is to be had from \Equ (or \ieq or \ie).
What Are the Justifications?
ONE - From Design
Computer programming languages, ideally, follow good design principles.
The design principles are minimality, referential transparency, orthogonality, correspondence, abstraction, pattern-based design, meta-model design.
Here is another: complete, balanced sets.
Julia currently has these standard comparison operators:
Operator | Alias | Name |
---|---|---|
== | | equality |
!= | ≠ | inequality |
< | | less than |
<= | ≤ | less than or equal to |
> | | greater than |
>= | ≥ | greater than or equal to |
=== | ≡ | equivalence |
Note how all two- and three-character operators have a single character Unicode alias – except for equality.
Adding ⩵ (“leading candidate”) for == would complete the set.
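As a quick illustration of what “alias” means for the existing operators, the check below shows that each Unicode form is the very same function object as its ASCII spelling, not a separate wrapper. This is a small sketch that only uses operators already defined in Base.

```julia
# Each comparison is between two names for one function object.
println((≠) === (!=))  # true
println((≤) === (<=))  # true
println((≥) === (>=))  # true

# Because they are the same object, behavior is identical by construction.
println(3 ≠ 4)   # true
println(3 ≤ 3)   # true
```

An alias for == along the same lines would leave the semantics of == untouched.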
TWO - From Human Cognition
Computer programming languages are also, ideally, compatible with how humans think. Part of that is designing against common errors that humans make.
Some computer languages are cognitively hard, either intentionally or otherwise.
Having a Unicode alias for == would help make Julia better for human cognition and help users avoid the mistake of using the assignment operator for the relational operator.
Human errors can be classified as “mistakes” or “slips”. What distinguishes them is the intention of the person. An error in the intention is called a mistake. An error in carrying out the intention is called a slip.
The error we are discussing is a “slip” – the user intended to type the relational operator (==), but instead typed the assignment operator (=). Furthermore, this is a skill-based error, as the user knows the language syntax and is skilled at typing.
So what causes the slip? Slips can be categorized based on their presumed sources. The slip of using = for == could be one of the following, or a combination of them:
- A blend slip, a combination of components from two competing schemas (where a “schema” is an organized memory unit).
- A mode error, an erroneous classification of the situation. Users move among many different environments where “=” has different meanings: mathematics and some programming languages use “=” as a relational operator.
- Associative activation, in which the currently active schema activates others with which it is associated.
Adding to the probability of making this slip is high familiarity with the task and low attention to it – our being on “auto pilot”. Skilled touch typists don’t have to think about how to move their fingers to generate letters and words; thought gets translated to action without cognitive effort. The user’s cognitive effort is focused on solving the problem for which they’re writing a program. Furthermore, actions done on “auto-pilot” are more subject to corruption from interruptions.
sources:
“Design Rules Based on Analysis of Human Error,” by Donald A. Norman, Communications of the ACM, April 1983, Vol. 26, No. 4
“A comprehension-based model of correct performance and errors in skilled, display-based, human-computer interaction,” by Muneo Kitajima and Peter G. Polson, Int. J. of Human-Computer Studies (1995) 43, 65-99
“Categorization of Action Slips,” by Donald A. Norman, Psychological Review, Jan. 1981, Vol. 88, No. 1
THREE - From Data
This is a common error. How do we know?
A) Linters and compilers are programmed to generate a warning for it.
The error in question is syntactically correct. Why would a linter or a compiler generate a warning? Because it’s a common error.
The VSCode Linter for Julia highlights it.
gcc has a command-line switch to warn on it (-Wall or -Wparentheses).
SAS JMP JSL will pop up a dialog upon script execution to ask the user if they really intended to do assignment in the if (answer: always no)
Other erroneous, but syntactically correct code is not highlighted, such as:
if a==5 && a==6
    println("a is both 5 and 6")
end
if 1 & 0 > 9
    println("Happy day")
end
if findfirst(@assert) chop(@macro) diagonal()
    println("nonsense passes lint")
end
B) The development of Yoda conditions to guard against this error.
C) Many lists of “common programming errors” and other discussions include using the assignment operator for the relational operator:
Top 5 Most Common Silly C++ Mistakes – see comments
The 10 Most Common C++ Mistakes
Which common mistakes do beginner C++ programmers often commit?
Confusion with assignment operators
Assignment versus equality
Some Common C Programming Bugs
8 Common Programming Mistakes
Common Beginner C++ Programming Mistakes
Exposing The Most Frequent Mistakes In Programming
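For readers unfamiliar with the Yoda conditions mentioned under B): the literal is written on the left, so that the slip turns into an invalid assignment instead of a silent one. The sketch below tries four conditions in Julia. The helper `is_rejected` is a hypothetical name; it wraps each condition in a small program with a Bool variable `x` and reports whether Julia refuses to run it.

```julia
# Hypothetical helper: returns true when Julia refuses to run the condition.
function is_rejected(condition::AbstractString)
    program = "let x = true; if $condition; end; end"
    try
        Core.eval(@__MODULE__, Meta.parse(program))
        return false
    catch
        # Any parse, lowering, or runtime error counts as "rejected".
        return true
    end
end

println(is_rejected("x == true"))  # false: conventional comparison runs
println(is_rejected("true == x"))  # false: Yoda comparison runs
println(is_rejected("x = true"))   # false: the slip runs silently
println(is_rejected("true = x"))   # true: the Yoda slip is an invalid assignment
```

The cost of this guard is that every comparison has to be written in the less natural order.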
Testimony
@D_A
I understand you very well and I am glad to see that I am not alone…
Personally, as I often work with SQL, XSL (and XPath), I regularly make this mistake too by putting = instead of == for equality, and as the ≠ symbol is shorter and cleaner (I never use !=), I tend to do my tests reversed with ≠ instead of direct ==…
Objections
Objection: You can do it yourself
julia> a ≟ b = a == b
≟ (generic function with 1 method)
julia> 4 ≟ 4
true
I replied: that makes you the only creepy dude in the world who does it, and you get puzzled questions from library maintainers.
@Tamas_Papp replied
Nay, that’s fine, others can just look up the function in your code, the tooling is there, like for any other function.
If you want an alias, just do it as @jzr suggested, but it is unlikely that you can make others change the way they code, or that the language should support you in this attempt.
@sijo replied:
I disagree, I think it’s generally a bad idea to introduce non-standard syntax without a strong reason. “Others can just look up…” may be right but missing the larger picture, which is that even small non-standard things add up. This can make codebases significantly less accessible to new readers/contributors. Even one idiosyncrasy is an unnecessary obstacle that can be annoying to people who jump through many codebases in a single day.
Response: redefining it yourself is code obfuscation. If I contribute that code to a library, I haven’t just done it for myself, but for you too. What if someone takes my code and wants to modify it? Do they follow my non-standard syntax or the standard syntax?
Defining your own syntax violates the program design principles of simple, reliable, and adaptable. Namely, not simple and not adaptable.
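For reference, the do-it-yourself route discussed in this objection can be written in two ways, and they are not quite the same thing. The sketch below assumes that ⩵ is parsed as a binary operator with comparison precedence, as ≟ is in the session quoted above. The first form creates a new function that forwards to ==; the second binds a second name to the same function object, which is how the existing aliases in Base behave.

```julia
# Wrapper: a new generic function that forwards to ==.
a ≟ b = a == b

# Alias: a second name for the very same function object.
const ⩵ = (==)

println((≟) === (==))  # false: a distinct function
println((⩵) === (==))  # true: the same function
println(4 ⩵ 4)         # true
```

Either way, the definition lives in one user's code, which is the readability concern raised in the responses above.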
Objection: it would break code already using the chosen symbol
my assertion
it doesn’t break anything
@Tamas_Papp
Technically it would break code already using these symbols as a function name or something.
Also, not using up a lot of the Unicode operator selection with various defaults was a very sane decision for Julia, since that allows users to make use of them.
Response: True enough, but that’s also true for the Unicode aliases released in 1.7 beta:
- ⫪ (U+2AEA, \Top, \downvDash) and ⫫ (U+2AEB, \Bot, \upvDash, \indep) may now be used as binary operators with comparison precedence (#39403).
- The middle dot · (\cdotp U+00b7) and the Greek interpunct · (U+0387) are now treated as equivalent to the dot operator ⋅ (\cdot U+22c5) (#25157).
- The minus sign − (\minus U+2212) is now treated as equivalent to the hyphen-minus sign - (U+002d) (#40948).
Or in 1.6:
- ꜛ (U+A71B), ꜜ (U+A71C) and ꜝ (U+A71D) can now also be used as operator suffixes. They can be tab-completed from ^uparrow, ^downarrow and ^! in the REPL (#37542).
I can go on. Hey, why not one more:
In 1.5:
- ⨟ is now parsed as a binary operator with times precedence. It can be entered in the REPL with \bbsemi followed by TAB (#34722).
The Unicode standard has 143,859 characters. We aren’t significantly depleting that resource if we use one more for an alias for ==.
Objection: would this really help the cognitive problem?
@yha asks
I don’t see how an alias for == would help with your original problem of mistakenly using = when you mean ==. What’s the connection?
Response:
It helps in that it replaces typing a compound symbol made of two equal signs, each of which has a strong association with “equality”, with typing \ie (or \Equ or \ieq).
One won’t mistake \ie for = .
I use \ne, \leq, and \geq all the time, even though it requires me to type more characters (including the annoying \ one). Also, it slows you down, so you think about it more (not on “auto pilot”).
Objection: It’s just one of many common errors
@oheil replied:
There are a lot of common errors. E.g. forgetting ; at the end of a line, forgetting a closing " at the end of a string, using ' instead of " for strings, just to name a few which come to mind for this and other languages.
Response: Not quite the same.
Forgetting to use “;”, or using “;” when you don’t need to, is a slip with a mode cause.
Using " or ’ for strings, when you should use the other, is a slip with a mode cause.
Whereas using = for == combines aspects of blend, mode, and associative activation.
A (hypothetical) analogous situation could arise for booleans:
∧ is the symbol for AND, and ∨ is the symbol for OR.
In our (hypothetical) language, however, ∧ is used for exponentiation.
So when you want to use AND in a boolean expression, you have to type ∧∧
Objection: the VSCode linter already highlights it
Response: this is not a very effective safety for catching this problem. The VSCode interface is an angry fruit salad and the linter is a fireworks show on top of it.
What do I mean by that? While typing, the linter is always highlighting stuff, popping up boxes for completions or definitions, underlining stuff, or changing the colors. You type in a bit of code, and of course – since you haven’t completed it yet – it breaks the downstream syntax, so the linter highlights and recolors it, only to revert back once you’re done typing. Or you type in “#=” to start a block comment, and all the text afterwards changes to the “comment color”, only to revert again when you put the “=#” in. In short, the linter cries wolf too often. This leads humans to ignore it. Ironically, it might be more effective if it were slower.
Objection: just use isequal()
Response: not a bad idea, although it is a little troubling that it doesn’t behave exactly like ==, so it is not a direct replacement.
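A few cases where the two differ, as a short sketch using only Base functions:

```julia
# NaN: == follows IEEE semantics, isequal treats NaN as equal to itself.
println(NaN == NaN)                  # false
println(isequal(NaN, NaN))           # true

# Signed zeros: == treats them as equal, isequal distinguishes them.
println(-0.0 == 0.0)                 # true
println(isequal(-0.0, 0.0))          # false

# missing: == propagates missing, isequal always returns a Bool.
println(missing == missing)          # missing
println(isequal(missing, missing))   # true
```

Code that compares floating-point values or data with missing entries could therefore change behavior if == were swapped for isequal().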
Finally:
@Tamas_Papp
I am not sure that there is anything else that can be done here, short of consulting a brain specialist.
Response:
That would have to be a team of brain specialists, ideally at a research university.