What is the difference between the two Unicode code points U+007C “VERTICAL LINE” | and U+2223 “DIVIDES” ∣? The Julia REPL’s LaTeX-like Unicode input \mid gives U+2223. HTML’s named character references for ∣, ∣, ∣ and ∣ are U+2223, and those for | and | are U+007C. I’m reading the “Report on the Algorithmic Language ALGOL 60”, which uses |-like symbols (in what is now called Backus–Naur form) to define the ALGOL 60 reference representation. I want to write a document about ALGOL 60 on a modern computer, so I want to replicate the original definition text. I don’t know who first used the vertical line symbol as an “or” operator or separator. What I do know is that the vertical line in set (or class) builder notation in mathematics, \{x\mid \varphi(x) \}, was invented and became popular around the 1940s to 1950s, and that in LaTeX we often use \mid for it. Which should I use to replicate ALGOL 60’s text?
One answer to your first question is that the Unicode people like to encode semantics rather than appearances, so there are lots of examples of similar-looking glyphs with different code points, e.g. the various asterisks.
Perhaps some fonts containing all the semantic alternatives could visually distinguish between the different meanings - but that might not be what you’re asking.
In many fonts, U+2223 is shorter than U+007C: ∣|, ∣|. Julia uses \mid for the former because of the HTML entity name. I wouldn’t necessarily assume that the HTML &mid and the LaTeX \mid names come from the same semantic idea.
Given that U+2223 is defined to semantically represent “divides”, I think I’d use the straightforward ASCII U+007C.
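If it helps, here is a minimal Python sketch for checking which of the two characters a piece of text actually contains, and for normalizing to the ASCII one. The helper names describe and to_ascii_bar are made up for illustration, and the sample rule is only a BNF-style line in the manner of the report, typed here with U+2223 on purpose.

```python
import unicodedata

VERTICAL_LINE = "\u007C"  # ASCII |
DIVIDES = "\u2223"        # ∣ (what \mid produces in the Julia REPL)


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and ASCII status of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} (ASCII: {ch.isascii()})"


def to_ascii_bar(text: str) -> str:
    """Hypothetical helper: replace every U+2223 with U+007C."""
    return text.replace(DIVIDES, VERTICAL_LINE)


# Illustrative BNF-style production, deliberately written with U+2223.
rule = "<digit> ::= 0\u22231\u22232\u22233\u22234\u22235\u22236\u22237\u22238\u22239"

print(describe(VERTICAL_LINE))
print(describe(DIVIDES))
print(to_ascii_bar(rule))
```

The two characters can look almost identical on screen, so comparing code points like this is a more reliable check than comparing glyphs.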