Difference between U+007C “VERTICAL BAR” `|` and U+2223 “DIVIDES” `∣`

What is the difference between the two Unicode code points U+007C “VERTICAL BAR” | and U+2223 “DIVIDES” ? Julia REPL’s LaTeX-like Unicode input of \mid gives U+2223, and HTML’s named character references for ∣, ∣, ∣ and ∣ are U+2223, and for | and | are U+007C. I’m reading “Report on the Algorithmic Language ALGOL 60” and it uses |-like symbols (what is now called Backus–Naur form) to define ALGOL 60 reference representation. I want to write a document about ALGOL 60 in a modern computer. So I want to replicate the original definition text. I don’t know who first used the vertical line symbol as an “or” operator or seperator. What I know is the vertical line in set (or class) builder notation in mathematics \{x\mid \varphi(x) \} was invented and getting popular around from 1940s to 1950s, and in LaTeX, we often use \mid in it. Which should I use for replicating ALGOL 60’s text?

2 Likes

One answer to your first sentence is that the Unicode people like to encode semantics rather than appearances, so there are lots of examples of similar/different glyphs - eg asterisks.
Perhaps some fonts containing all the semantic alternatives could visually distinguish between the different meanings - but that might not be what you’re asking. :thinking:

In many fonts, U+2223 is shorter than U+007C: ∣|, ∣|. The reason Julia uses \mid for the former from the HTML entity name. I wouldn’t necessarily assume that the HTML &mid and the LaTeX \mid names come from the same semantic idea.

Given that the unicode is defined to semantically represent “divides” I think I’d use the straightforward ASCII.

3 Likes