`in(x, y)` and `y in x` behave differently?

I made a dummy range

struct MyRange
    starting::Int
    ending::Int
end

for which I only care to check whether integers are contained there.
Consequently I implemented the Base.in

function Base.in(r::MyRange, i::Int)
    return r.starting <= i <= r.ending
end

Testing the implementation, everything works as expected:

julia> r = MyRange(4, 7)
MyRange(4, 7)

julia> in(r, 5) # works
true

But if I use the binary operator syntax, it errors:

julia> 5 in r # errors
ERROR: MethodError: no method matching iterate(::MyRange)

Closest candidates are:
  iterate(::Base.AsyncGenerator, ::Base.AsyncGeneratorState)
   @ Base asyncmap.jl:362
  iterate(::Base.AsyncGenerator)
   @ Base asyncmap.jl:362
  iterate(::Pkg.Registry.RegistryInstance)
   @ Pkg ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/Pkg/src/Registry/registry_instance.jl:455
  ...

Stacktrace:
 [1] in(x::Int64, itr::MyRange)
   @ Base ./operators.jl:1292
 [2] top-level scope
   @ REPL[32]:1

I get it that I need to implement an iterator for that, but I just don’t want to.

For the sake of completeness the example works if I add the following methods but I find them redundant for my case.

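function Base.iterate(r::MyRange)
    r.starting > r.ending && return nothing  # empty range
    return (r.starting, 1)
end

function Base.iterate(r::MyRange, i::Int)
    # i is the offset of the next element from r.starting
    r.starting + i > r.ending && return nothing
    return (r.starting + i, i + 1)
end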

I really thought that using any binary operator (say ˣ) as ˣ(a,b) or a ˣ b is indistinguishable. It seems that’s not the case. Is in a corner case because it’s not a commutative operator? But still, I feel that if Base.in(a, b) is implemented, then b in a should bypass anything else and default to that.

Your arguments are in the wrong order. x in y is equivalent to in(x, y), not in(y, x).
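For concreteness, here is a minimal sketch of the method with the arguments swapped, reusing the MyRange struct from the question. It assumes only integer membership matters; other element types would still fall through to the generic iteration-based fallback.

```julia
struct MyRange
    starting::Int
    ending::Int
end

# Item first, collection second: `i in r` is the same call as `in(i, r)`.
Base.in(i::Int, r::MyRange) = r.starting <= i <= r.ending

r = MyRange(4, 7)

in(5, r)   # function-call form
5 in r     # infix form, dispatches to the same method
8 ∈ r      # `∈` is the same function as `in`
```

With this method in place, no iterate methods are needed for the integer case.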

This wasn’t very explicit in the in docstring, so I submitted a pull request: note infix syntax for "in" and "∈" by stevengj · Pull Request #54091 · JuliaLang/julia · GitHub

Julia is somewhat consistent in the order of these arguments, also for similar functions: when writing a function call, the correct order of arguments is what would be natural if the function could be used infix: contains, occursin, startswith, endswith, etc.

It’s contains(haystack, needle), since haystack contains needle would be the natural infix version.
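A small sketch of that reading order, using made-up strings purely for illustration. Each call is written so that moving the function name between the two arguments gives the English sentence in the comment.

```julia
haystack = "infix operators"
needle = "fix"

a = contains(haystack, needle)    # haystack contains needle
b = occursin(needle, haystack)    # needle occurs in haystack
c = startswith(haystack, "in")    # haystack starts with "in"
d = endswith(haystack, "ors")     # haystack ends with "ors"
e = in(needle, ["fix", "pre"])    # needle in collection

println((a, b, c, d, e))
```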

Which is, in fact, another argument that all these functions (and others, maybe even all two-arg functions) should be available with an infix syntax. It quite boggles my mind that this idea never gained much traction in Julia. When it comes to readability, there are still a few slices that Julia could take of the Python cake.

Personally I don’t like infix in because it’s different from almost everything else in the language.

In Haskell, you can write any function infix by surrounding it in backticks:

> f x y = x + y
> f 1 2
3
> 1 `f` 2
3

I don’t like that everything else in the language is different from infix-in. So I think we’re actually in agreement :wink:

Quite aside from the parsing challenges this would entail, would this not mean that every two-arg function would need to be assigned a precedence? There’s no mechanism for that presently, no way of going back and assigning them to all the two-arg functions which currently exist, and this would have to be conveyed to the parser, which poses serious problems with using a function so-defined in the same module as it’s defined in.

When I say “parsing challenges” that isn’t even what I mean. I don’t think it would be decidable.

julia> a, b, c = 1, 2, 3
(1, 2, 3)

julia> [a b c]
1×3 Matrix{Int64}:
 1  2  3

julia> [a < c]
1-element Vector{Bool}:
 1

What if b were a two-arg function? What if the matrix were in a loop, and b changed from an int to a function?

julia> a, b, c = 1, <, 3
(1, <, 3)

julia> [a b c]
1×3 Matrix{Any}:
 1  <  3

julia> b(a, c)
true

Infix is special because the parser has to know about it.
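One way to see this, as a rough sketch: Meta.parse shows what the parser decides from the text alone, before a, b, or c have any value. The variable names ex_row and ex_cmp are only for illustration.

```julia
# Parse the two array literals without evaluating them.
ex_row = Meta.parse("[a b c]")   # three space-separated elements
ex_cmp = Meta.parse("[a < c]")   # one element: an infix call

println(ex_row.head, " ", ex_row.args)   # horizontal concatenation of a, b, c
println(ex_cmp.head, " ", ex_cmp.args)   # a vector holding the call a < c
```

The parser picks between the two shapes purely because `<` is a known operator token and `b` is not; what `b` is bound to at run time plays no part.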

Yes, there are languages which let you specify a custom operator and define a precedence for it. They usually restrict this to certain characters, without overlap with normal symbolic names, though exceptions exist, Prolog being one. I’m glad I never wrote a parser for any of them, and writing parsers is what I do.

Can’t they all have equal (and low) precedence? That is, left to right?

I’m guessing the inside of array constructors is already a somewhat special context, and at least in that context, it would seem pretty feasible if symbols have higher precedence than infix functions. That is, [a contains b] has three elements (which it currently does), and [(a contains b)] contains 1 element.

I wouldn’t even mind some pretty severe restrictions, like requiring that infix functions return booleans and/or that they be enclosed in parentheses to be a complete expression. Wouldn’t “anything that might be ambiguous needs parentheses” be pretty workable?

I’m not saying it wouldn’t make the parser more difficult to write, but it’s hard for me to imagine that if (a contains b) then, or contained = (a contains b) would be impossible to support if there was a will.

Right now, (x f y) is not a valid Julia expression, and I don’t really see a reason that that couldn’t lower to f(x, y). Although there are some pretty long threads about this topic both on Discourse and GitHub, and I’m sure there was some objection to (x f y).
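A quick way to check that claim, as a sketch only: ask the parser directly and capture whatever it throws. The variable name err is just for illustration.

```julia
# Try to parse "(x f y)"; Meta.parse raises on invalid syntax by default.
err = try
    Meta.parse("(x f y)")
catch e
    e
end

println(typeof(err))
```

Since the form is currently a parse error rather than valid code with some other meaning, giving it a meaning would not silently change existing programs.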

in and isa both have precedence 7, so that isn’t a non-starter…
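For anyone who wants to check the numbers, a sketch using Base.operator_precedence, which is an unexported helper, so treat it as subject to change between versions. Higher values bind tighter, and 0 means the symbol is not an infix operator at all.

```julia
prec_in       = Base.operator_precedence(:in)
prec_isa      = Base.operator_precedence(:isa)
prec_lt       = Base.operator_precedence(:<)
prec_contains = Base.operator_precedence(:contains)  # an ordinary function name

println((prec_in, prec_isa, prec_lt, prec_contains))
```

in and isa sit at the same level as the comparison operators, which is why they interact with < and > the way they do.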

This, I don’t like. Now there are real infix and pseudo-infix functions:

julia> [1 in 1:5]
1-element Vector{Bool}:
 1

My objection is more like the antipathy some people feel for |> (which I like): there doesn’t need to be two ways to call a function (this would make three!), and it adds an edge case to the language where there doesn’t need to be one. If you get used to writing a contains b, and write [is_done has_flag "foo" contains char], that’s a bug which never had to happen. in is a load-bearing part of the language, but I wouldn’t miss infix isa if we didn’t happen to have it.

Here’s another “need to use parens”, albeit a bit contrived:

julia> begin
         5
         <
         7
       end
7

julia> foo, bar, baz = :foo, :bar, :baz
(:foo, :bar, :baz)

julia> begin
         foo
         bar
         baz
       end
:baz

TL;DR: it can’t be successfully parsed. Either the parser simply has to know all the infix operators, or you have fake ones which only work sometimes.

julia> begin
         bux = 5
         >
         quux = 3
       end
3

It opens a real can of worms, is what I’m saying.

I’m fine writing if contains(haystack, needle). And consider this: at precedence 7 (and given that our two letter-named infix operators both have 7, it would be strange to pick an even lower number), if 7 > 5 min 6 is a MethodError, while if 5 min 6 < 7 is true.

I grant you that these problems could be solved with enough parentheses. I just don’t think it’s a good idea.

If you tend to forget the argument order, you also have a built-in curried version. This also avoids the need for Ref when broadcasting.

x    = [2, 4, 6]
list = [1, 2, 3]

in(list).(x)
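
To spell out the comparison, a short sketch with the same x and list: the one-argument in(list) returns a function equivalent to y -> y in list, so broadcasting it over x needs no Ref, whereas the two-argument forms need Ref(list) to stop list from being broadcast over as well.

```julia
x    = [2, 4, 6]
list = [1, 2, 3]

curried  = in(list).(x)        # broadcast the one-argument predicate over x
with_ref = in.(x, Ref(list))   # two-argument form, list protected by Ref
infix    = x .∈ Ref(list)      # same thing with the dotted infix operator

println(curried)
```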