Understanding issorted's lt keyword

I don’t see where the documentation specifies how issorted should treat equal elements. My guess is that the authors of issorted() assumed lt to be a strict comparison (i.e. lt(a,a) == false) but failed to document this. lt = <= is then simply an invalid input.

1 Like

I think that’s what I want. Uniqueness seems unrelated to sortedness.

Why would I want that?

I don’t think we should change the documentation; instead we should change the behavior to conform to the documentation.

The documentation says

Test whether a vector is in sorted order. The lt, by and rev keywords modify what order is considered to be sorted just as they do for sort.

IMO the most obvious reading of “just as they do for sort” – and the most obvious behavior irrespective of documentation – is that issorted(sort(seq, lt=f), lt=f) is true.

1 Like

Sorry, I didn’t mean to imply that that’s what you want, it is the motivation of the OP:

Aha, for that case, I would use sort ∘ unique to change the sequence, and allunique(seq) && issorted(seq) to check.

I don’t see where the current behavior contradicts the documentation. The documentation is just vague. It doesn’t tell you whether lt should implement a total order or a strict total order, although it uses isless as a default, which implements a strict total order.

1 Like

If A has non-unique elements, I guess sort(A; lt=<=) should work, and sort(A; lt=<) should throw an error. I think.

Furthermore, if A has non-unique elements issorted(A; lt=<) should return (after sorting) false, and issorted(A; lt=<=) should return true.

I agree. It’s just that the behaviour is extremely counter-intuitive. To some, at least. Do you find the current behaviour intuitive? As in, if you just read the docstring for the first time, this is what you would expect:

jl> issorted([1,2,3,3]; lt=<=)
false

jl> issorted([1,2,3,3]; lt=<)
true

?

If this is reasonably obvious to many users, then I guess this comes down to habit, I just thought the obvious behaviour was

issorted([a, b]; lt=foo)  <==> foo(a, b)
1 Like

Personally, I would have never provided <= as lt. After all it stands for “less than” and not “less than or equal”. So, for me, the strict order constraint, although not explicitly mentioned, coincides with my intuition.

(I would be fine with expanding the documentation in this regard or changing the implementation without modifying the behaviour for strict order comparisons.)

5 Likes

I’m struggling to understand this argument. It seems to argue about what sort of comparisons one ought to input, not how the comparison should be used.

You can provide whatever binary function you want, not sure why the abbreviation should dictate that. <= is accepted. The way I read it is

lt=foo : a is considered less than b if foo(a, b) is true.

You could use lt=(>) if you like, or even lt=(x,y)->rand(Bool).

Are sort ∘ unique and allunique(seq) && issorted(seq) really the most efficient way to accomplish those tasks?

For the former, perhaps sorted_unique(x) = (unique! ∘ sort! ∘ copy)(x)

That is a very good example. Yes, you can input any binary function that returns a Bool. But if you would actually use lt=(x,y)->rand(Bool), would you consider the result useful?

If yes, do you think the current implementation of sort/issorted is at fault for making this lt not work in a useful way?

If no, then why do you think the result for lt=<= would be useful?

1 Like

No, the most efficient would probably be a loop.

If you want a random shuffle, why not? Useful is in the eye of the beholder.

Not sure I understand the question. It would be useful because every pair of elements would satisfy that comparison, every element would be ‘less than or equal to’ the next element.

On the other hand, lt=< would mean that every element is strictly smaller than the next. But that’s not the current behavior.

Okay. So let’s assume you did sort(A, lt=(x,y)->rand(Bool)). You now expect A to be sorted according to lt, so issorted(A, lt=(x,y)->rand(Bool)) should return true. You try it and it returns false! Do you consider that a bug in sort/issorted?

No, this is a reasonable exception, occurring because I input a ‘perverse’ comparison function.

I can do shuffle!(A) but testing for ‘shuffleness’ wouldn’t make sense.

1 Like

But <= is a ‘perverse’ comparison function as well, because it does not implement a strict total order.

Why would I care about that. It implements a reasonable comparison.

“Each element is less than or equal to the next” is a perfectly fine way of sorting, and is in fact what sort(A) returns.

How do you define “reasonable comparison”? For me, in the context of sort/issorted a reasonable comparison is defined by a strict total order.

If sort explicitly wants a “lesser than” function, a “lesser than or equal” function does not seem perfectly fine to me. How would Julia figure out whether you supplied a strict total order or a non-strict total order?

I don’t. Whatever works is fine. Including random. It’s just that you cannot expect random sorts to be repeatable.

But then how can you do sorting of non-unique vectors? Every element will not be strictly less than the next, therefore not sorted.

Where is this made explicit? The closest I could find was

the lt keyword allows providing a custom “less than” function

in the manual, and nothing in the docstring.

But it does actually accept “lesser than or equal”, since issorted([1,2,2,3]) is true, and that is only sorted if equal is accepted as “less than”.

Why should it figure that out?