`in(x, y)` and `y in x` behave differently?

filchristou · April 15, 2024, 1:46pm

I made a dummy range

struct MyRange
    starting::Int
    ending::Int
end

for which I only care to check whether integers are contained there.
Consequently, I implemented Base.in:

function Base.in(r::MyRange, i::Int)
    return r.starting <= i <= r.ending
end

Testing the implementation, everything works as expected:

julia> r = MyRange(4, 7)
MyRange(4, 7)

julia> in(r, 5) # works
true

But if I use the binary operator syntax it errors

julia> 5 in r # errors
ERROR: MethodError: no method matching iterate(::MyRange)

Closest candidates are:
  iterate(::Base.AsyncGenerator, ::Base.AsyncGeneratorState)
   @ Base asyncmap.jl:362
  iterate(::Base.AsyncGenerator)
   @ Base asyncmap.jl:362
  iterate(::Pkg.Registry.RegistryInstance)
   @ Pkg ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/Pkg/src/Registry/registry_instance.jl:455
  ...

Stacktrace:
 [1] in(x::Int64, itr::MyRange)
   @ Base ./operators.jl:1292
 [2] top-level scope
   @ REPL[32]:1

I get it that I need to implement an iterator for that, but I just don’t want to.

For the sake of completeness the example works if I add the following methods but I find them redundant for my case.

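function Base.iterate(r::MyRange)
    r.starting > r.ending && return nothing  # empty range
    return (r.starting, 1)
end

function Base.iterate(r::MyRange, i::Int)
    # i is the offset of the next element; stop once it would pass r.ending
    r.starting + i > r.ending && return nothing
    return (r.starting + i, i + 1)
end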

I really thought that calling any binary operator (say ˣ) as ˣ(a, b) or as a ˣ b was indistinguishable. It seems that’s not the case. Is in a corner case because it’s not a commutative operator? Still, I feel that if Base.in(a, b) is implemented, then b in a should bypass everything else and dispatch to it.

stevengj · April 15, 2024, 1:49pm

Your arguments are in the wrong order. x in y is equivalent to in(x, y), not in(y, x).
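To make that concrete, here is a minimal sketch of the method with the arguments swapped (element first, container second). It reuses the MyRange struct from the question; no iterate methods are defined, and the infix form still works because it dispatches to the same method.

```julia
struct MyRange
    starting::Int
    ending::Int
end

# Element first, container second: `i in r` is the same call as `in(i, r)`.
Base.in(i::Int, r::MyRange) = r.starting <= i <= r.ending

r = MyRange(4, 7)

in(5, r)   # true
5 in r     # true, same method, no iterate needed
5 ∈ r      # true, ∈ is the same function as in
9 in r     # false
```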

This wasn’t very explicit in the in docstring, so I submitted a pull request: note infix syntax for "in" and "∈" by stevengj · Pull Request #54091 · JuliaLang/julia · GitHub

goerz · April 15, 2024, 7:56pm

Julia is somewhat consistent in the order of these arguments, also for similar functions: when writing a function call, the correct order of arguments is what would be natural if the function could be used infix: contains, occursin, startswith, endswith, etc.

It’s contains(haystack, needle), since haystack contains needle would be the natural infix version.
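A quick illustration of that reading order with the Base string functions (the string literals are just placeholders):

```julia
haystack = "haystack"
needle   = "hay"

contains(haystack, needle)      # "haystack contains needle"    -> true
occursin(needle, haystack)      # "needle occurs in haystack"   -> true
startswith(haystack, needle)    # "haystack starts with needle" -> true
endswith(haystack, "stack")     # "haystack ends with stack"    -> true
```

Note that contains and occursin do the same check with the arguments swapped, which is exactly what the "read it as infix" rule predicts.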

Which is, in fact, another argument that all these functions (and others, maybe even all two-arg functions) should be available with an infix syntax. It quite boggles my mind that this idea never gained much traction in Julia. When it comes to readability, there are still a few slices that Julia could take of the Python cake.

jar1 · April 15, 2024, 8:00pm

Personally I don’t like infix in because it’s different from almost everything else in the language.

In haskell, you can write any function infix by surrounding it in backticks

> f x y = x + y
> f 1 2
3
> 1 `f` 2
3

goerz · April 15, 2024, 8:19pm

I don’t like that everything else in the language is different from infix-in. So I think we’re actually in agreement

mnemnion · April 15, 2024, 10:07pm

Quite aside from the parsing challenges this would entail, would this not mean that every two-arg function would need to be assigned a precedence? There’s no mechanism for that presently, no way of going back and assigning them to all the two-arg functions which currently exist, and this would have to be conveyed to the parser, which poses serious problems with using a function so-defined in the same module as it’s defined in.

When I say “parsing challenges” that isn’t even what I mean. I don’t think it would be decidable.

julia> a, b, c = 1, 2, 3
(1, 2, 3)

julia> [a b c]
1×3 Matrix{Int64}:
 1  2  3

julia> [a < c]
1-element Vector{Bool}:
 1

What if b were a two-arg function? What if the matrix were in a loop, and b changed from an int to a function?

julia> a, b, c = 1, <, 3
(1, <, 3)

julia> [a b c]
1×3 Matrix{Any}:
 1  <  3

julia> b(a, c)
true

Infix is special because the parser has to know about it.
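One way to see this is to look at what the parser produces before any variable has a value. The following is a small sketch using Meta.parse; the names a, b, c are never defined, which is the point.

```julia
# Parsing happens before a, b, c are bound to anything.
ex_hcat = Meta.parse("[a b c]")   # three space-separated items
ex_cmp  = Meta.parse("[a < c]")   # known infix operator
ex_in   = Meta.parse("[a in c]")  # `in` is also known to the parser

ex_hcat.head      # :hcat
ex_hcat.args      # Any[:a, :b, :c]
ex_cmp.head       # :vect
ex_cmp.args[1]    # :(a < c)
ex_in.args[1]     # :(a in c)
```

The shape of the expression is fixed at parse time, so whether b later turns out to be an integer or a function cannot change whether [a b c] is a row of three items or a single call.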

Yes, there are languages which let you specify a custom operator and define a precedence for it. They usually restrict this to certain characters, without overlap with normal symbolic names, though exceptions exist, Prolog being one. I’m glad I never wrote a parser for any of them, and writing parsers is what I do.

goerz · April 15, 2024, 11:09pm

Can’t they all have equal (and low) precedence? That is, left to right?

I’m guessing the inside of array constructors is already a somewhat special context, and at least in that context, it would seem pretty feasible if symbols have higher precedence than infix functions. That is, [a contains b] has three elements (which it currently does), and [(a contains b)] contains 1 element.

I wouldn’t even mind some pretty severe restrictions, like requiring that infix functions return booleans and/or that they be enclosed in parentheses to form a complete expression. Wouldn’t “anything that might be ambiguous needs parentheses” be pretty workable?

I’m not saying it wouldn’t make the parser more difficult to write, but it’s hard for me to imagine that if (a contains b) then, or contained = (a contains b) would be impossible to support if there was a will.

Right now, (x f y) is not a valid Julia expression, and I don’t really see a reason that that couldn’t lower to f(x, y). Although there are some pretty long threads about this topic both on Discourse and Github, and I’m sure there was some objection to (x f y).

mnemnion · April 15, 2024, 11:45pm

in and isa both have 7, so that isn’t a non-starter…
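For anyone who wants to check the numbers: they can be queried with Base.operator_precedence. That function is unexported, so treat the exact values as version-dependent; the sketch below only relies on how they compare to each other.

```julia
# Base.operator_precedence is an unexported helper; exact numbers may vary by version.
p_in  = Base.operator_precedence(:in)
p_isa = Base.operator_precedence(:isa)
p_lt  = Base.operator_precedence(:<)
p_add = Base.operator_precedence(:+)

println((p_in, p_isa, p_lt, p_add))

# `in` and `isa` sit at comparison level, below arithmetic,
# so `1 + 1 in 1:5` groups as `(1 + 1) in 1:5`.
same_grouping = Meta.parse("1 + 1 in 1:5") == Meta.parse("(1 + 1) in 1:5")
```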

This, I don’t like. Now there are real infix and pseudo-infix functions:

julia> [1 in 1:5]
1-element Vector{Bool}:
 1

My objection is more like the antipathy some people feel for |> (which I like): there doesn’t need to be two ways to call a function (this would make three!), and it adds an edge case to the language where there doesn’t need to be one. If you get used to writing a contains b, and write [is_done has_flag "foo" contains char], that’s a bug which never had to happen. in is a load-bearing part of the language, but I wouldn’t miss infix isa if we didn’t happen to have it.

Here’s another “need to use parens” albeit a bit contrived:

julia> begin
         5
         <
         7
       end
7

julia> foo, bar, baz = :foo, :bar, :baz
(:foo, :bar, :baz)

julia> begin
         foo
         bar
         baz
       end
:baz

TL;DR, it can’t be successfully parsed, the parser simply has to know all the infix operators, or you have fake ones which only work sometimes.

julia> begin
         bux = 5
         >
         quux = 3
       end
3

It opens a real can of worms, is what I’m saying.

I’m fine writing if contains(haystack, needle). And consider this: our two letter-based infix operators both have precedence 7, so it would be strange to pick an even lower number, yet at precedence 7, if 7 > 5 min 6 is a MethodError, and if 5 min 6 < 7 is true.

I grant you that these problems could be solved with enough parentheses. I just don’t think it’s a good idea.

alfaromartino · April 15, 2024, 11:53pm

If you tend to forget about it, you also have a built-in curried version. This also avoids the need of Ref.

x    = [2, 4, 6]
list = [1, 2, 3]

in(list).(x)
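For comparison, here is the same check written three ways. in(list) is the one-argument form of in, which returns a function equivalent to y -> in(y, list); the variable names a, b, c are only for illustration.

```julia
x    = [2, 4, 6]
list = [1, 2, 3]

a = in(list).(x)              # curried form, broadcast over x
b = in.(x, Ref(list))         # explicit broadcast; Ref stops list from being broadcast over
c = [y in list for y in x]    # plain comprehension

# All three give [true, false, false]: only 2 is in list.
```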
