Non-unicode versions of unicode functions in base/stdlib?

galenlynch · August 13, 2018, 7:02pm

I’m looking at someone else’s code in Julia 1.0 and noticed they use the ∉ (not in) function, which doesn’t seem to have an ascii equivalent (e.g. notin). I realize that it’s possible to get the same functionality by typing ! (el in set), but it reminds me of the deprecation of @test_approx_eq in Julia 0.5 in favor of @test a ≈ b atol=ε in Julia 0.6.

I have to say that I’m a little alarmed at the existence of unicode-only functions in base. I’ve seen people raise similar concerns in various places at various times, and the answer has been something along the lines of “get a proper coding environment that supports unicode.” I often run Julia code in a cluster environment, and having to enter unusual unicode characters into scripts and packages over ssh/vim seems unnecessarily difficult.

Is there any policy on having every function and operator in base as well as the standard library have a method name that can be easily typed on any US keyboard? This would by no means preclude also having the unicode function for notational elegance and speed in less restrictive coding environments.

jebej · August 13, 2018, 7:55pm

From what I understand, yes, to some degree, we generally try to always give a non-unicode name to functions, and only alias it with unicode symbols for convenience.

You can always try to evaluate the symbol to see if it is an alias for a non-unicode function, as with ≈ and ∈:

julia> ≈
isapprox (generic function with 3 methods)

julia> ∈
in (generic function with 28 methods)

julia> ∉
∉ (generic function with 1 method)

You can see that ≈ and ∈ are aliases for isapprox and in, respectively. ∉ is indeed its own function.
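The same check can be made without reading the REPL printout. The following is a minimal sketch, assuming a Julia 1.x session: an alias is the very same function object as its ASCII name, so `===` separates aliases from independent functions. The variable names are only illustrative.

```julia
# An alias is bound to the same function object as its ASCII name,
# so object identity (===) distinguishes aliases from separate functions.
approx_is_alias = (≈) === isapprox   # ≈ is another name for isapprox
in_is_alias     = (∈) === in         # ∈ is another name for in
notin_is_in     = (∉) === in         # ∉ is a distinct function, not `in` itself

println((approx_is_alias, in_is_alias, notin_is_in))   # prints (true, true, false)
```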

Could there be a notin function for ∉ to alias? Sure, but at some point you need to think about how many names you’ll end up with, and if those will start to overlap with potentially useful variable names that people might want to use. Here the decision was made to have in, and not notin.

I think that’s a fair tradeoff: you are not forced to use the unicode version if you can’t, and if you can, you get a more convenient syntax.

galenlynch · August 13, 2018, 8:01pm

Thanks for the prompt reply!

I would argue that saving the notin name for a function different than ∉ would be very confusing, and that notin should probably exist as an alias for ∉ . I think the same could be said for any ‘plain text’ translation of unicode characters. I do appreciate that most unicode functions in base have ascii equivalents, but it would be reassuring if there were a policy that all functions in base should have an ascii name.

jebej · August 13, 2018, 8:21pm

You could always make an issue about it. I don’t know how many functions there are that only have a unicode name, but I suspect it’s not a large number.

DNF · August 13, 2018, 8:27pm

I think that the correct non-unicode version of ∉ should be !in. This already works for function notation, as !in(a, b), but not for operators. There is an open issue here:

https://github.com/JuliaLang/julia/issues/25512
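For reference, here is a short sketch of the ASCII-only spellings that are equivalent to `1 ∉ xs`. It assumes Julia 1.0 or later, where `in(xs)` with a single argument returns a predicate and `!` can negate a function. The helper `notin` is hypothetical and not part of Base; it is shown only to illustrate how little code such an alias would need.

```julia
xs = [2, 3]

# Three ASCII-only ways to write `1 ∉ xs`:
a = !in(1, xs)      # function-call form, negated
b = !(1 in xs)      # infix `in`, negated
c = (!in(xs))(1)    # `in(xs)` is a one-argument predicate; `!` negates the function

# Hypothetical helper, NOT defined in Base:
notin(x, itr) = !in(x, itr)

# The negated predicate is handy with higher-order functions:
kept = filter(!in(xs), 1:4)   # elements of 1:4 that are not in xs
```

What does not work is using `!in` as an infix operator, which is the subject of the linked issue.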

jeff.bezanson · August 13, 2018, 8:29pm

To me ∉ is an alias for !in. We should perhaps change it to be implemented that way. A related issue is that a function like ∉ is not intended to be overloaded separately; you should only define in. So the less it is considered a separate function in its own right, the better.

Edit: I’ll add that as far as I know, every unicode name is either an alias, or a quasi-alias like this one.
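To illustrate the point about overloading, here is a minimal sketch with a hypothetical type `EvenNumbers` (not from Base or any package). Only `Base.in` is defined; `∈`, `∉`, and `!in` all pick it up without further methods.

```julia
# Hypothetical type, for illustration only.
struct EvenNumbers end

# Define membership once, on `in`. Do not add a separate method for ∉.
Base.in(x::Integer, ::EvenNumbers) = iseven(x)

evens = EvenNumbers()

r1 = 4 ∈ evens        # ∈ is an alias for `in`
r2 = 3 in evens       # infix ASCII form
r3 = 3 ∉ evens        # ∉ falls back to the negation of `in`
r4 = !in(4, evens)    # ASCII spelling of `4 ∉ evens`
```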

galenlynch · August 13, 2018, 9:01pm

As long as this remains the case then I’ll be a happy coder. Thanks!

galenlynch · August 14, 2018, 1:18am

Another unicode only operator: ∘

ptoche · May 14, 2021, 6:10am

N̶o̶ ̶c̶h̶a̶n̶g̶e̶ ̶o̶n̶ ̶t̶h̶i̶s̶ ̶i̶s̶s̶u̶e̶ ̶i̶t̶ ̶s̶e̶e̶m̶s̶.̶ (my misunderstanding, see below)

julia> ∉
∉ (generic function with 2 methods)

julia> ∘
∘ (generic function with 3 methods)

Apart from resisting having unicode-only operators, there is the practical matter of learning and remembering how to type some of them. On the one hand \notin is easy to remember. On the other hand \circ requires googling and reaching a place like this. Searching for “composition” or “function composition” here fails because it’s listed as “ring operator”. Note to self: ∘ is pronounced “circle” but \circle doesn’t work.

sijo · May 14, 2021, 8:35am

As mentioned above, you can use !in instead of ∉, for example !in(1, [2,3]). To discover this, try for example @less 1 ∉ [2,3] and the same for ∈.

Similarly for ∘: doing @less sin ∘ cos shows this is equivalent to ComposedFunction(sin, cos).

So there are already non-Unicode function names for these Unicode operators.
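A minimal sketch of the composition case, assuming Julia 1.6 or later (earlier versions return an anonymous function from `∘` instead of a `ComposedFunction`). The type is written as `Base.ComposedFunction` so that the example does not depend on whether the name is exported; the variable names are illustrative.

```julia
# Assumes Julia 1.6+, where `∘` constructs a Base.ComposedFunction.
h_unicode = sin ∘ cos
h_ascii   = Base.ComposedFunction(sin, cos)   # ASCII-only spelling

same_type   = h_unicode isa Base.ComposedFunction
same_result = h_unicode(0.5) == h_ascii(0.5) == sin(cos(0.5))

# A plain ASCII alternative that works on any Julia version:
h_plain = x -> sin(cos(x))
```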

Concerning the input problem: you can use ? to find how to type these symbols. For example ?∘ shows:

help?> ∘
"∘" can be typed by \circ<tab>

This is explained here in the manual.

Of course you need to have the symbol somewhere to copy-paste…

ptoche · May 16, 2021, 2:03pm

Thanks sudete. I didn’t know about ComposedFunction(). It is not mentioned in the manual. I also didn’t know about @less. Very useful macro! I was under the impression from the discussion above that typing the unicode at the julia> prompt was going to return the “original” name, as with

julia> ∈
in (generic function with 38 methods)

so I assumed that

julia> ∘
∘ (generic function with 3 methods)

meant there was no “original” non-unicode definition, but the output of @less sin ∘ cos suggests ComposedFunction is the real McCoy. (@less ∘ won’t work, though, and neither will @less f ∘ g if f and g are not functions.) Thanks for sharing these very useful tricks.
