Understanding issorted's lt keyword

So I don’t don’t know if I’m in the minority here, but what happens if the design confounds the expectations of (say, for the sake of argument) 99% of users. What then?

Also, even after this long thread with many explanations, I am totally confused about what it means for something to be sorted. And I also have zero idea why the current design is better for generic programming. Or why ‘strict total ordering’ matters. Was this adressed?

I’m clearly being difficult, but I am curious, and I do have a math degree, and have sorted countless lists. How can this be explained to a ‘normal user’?

It sounds like you are willing to accept the claim that isless and isequal is more fundamental than <=. So, then your next struggle is also your solution: you must design a definition of “sequence is sorted” that relies on isless. You then notice (after a weird headache) that you can do it.

This switching of order and negation causes weird results, and the motivation is opaque to me.

I find it wrong to say that that’s the reason for the weird results. Some people are getting weird results because they are guessing incorrectly at an implementation of a function. Not only is the guess incorrect, but so would be the implementation for the default value isless.

1 Like

I don’t object to that. But why must sort be implemented in terms of the most fundamental function?

1 Like

The definition Julia uses is that a list is sorted iff for every pair of elements, either:

  1. The elements are equal
  2. The first element is less than the second

The reason for this design is it only relies on users to define ==(a::T,b::T) and <(a::T,b::T).
The reason to define sort in terms of the more fundamental function is to increase the chance that user defined methods will be used when applicable.

2 Likes

The design perfectly fits the expectation of users who know the mathematical definition of a total order, and that is much more than 1% of users.

2 Likes

Then it would help a lot if the first clause was made very explicit. And that the lt input is subject to very strict requirements.

1 Like

Yeah. The docs probably should say something along the lines of “lt should be a function that forms a total order when combined with isequal

That is not correct. All you need is < / isless. Take for example

julia> mutable struct Wrap
       _::Int
       end

julia> Base.isless(a::Wrap, b::Wrap) = isless(a._, b._)

julia> Wrap(3) == Wrap(3)
false

julia> issorted(map(Wrap, 1:10))
true

julia> issorted(map(Wrap, [1, 1, 1]))
true

The number was for the sake of argument. But how big do you think that number is?

And as I said, the docs don’t even mention “total order” (or do they), so why would I expect that to be relevant?

The documentation for Base.isless does mention “total order” (Understanding issorted's lt keyword - #2 by goretkin) . And isless is the default argument. The connection isn’t as clear as could be, but it’s there.

The default argument creates(?) a total ordering, but it doesn’t follow that all arguments must do so.

Oh, well, it’s getting late. Maybe sleeping on it will help.

1 Like

I don’t see a ranking of fundamentalness between < and <=. It’s possible to define either in terms of the other. A total order is normally defined in terms of <=.

Checking the strict-sortedness of a collection that allows duplicates seems like a type error. isstrictlysorted(x::OrderedSet) is fine but isstrictlysorted(x::Vector) doesn’t make much sense, or at least is error prone.

1 Like

My contention is that issorted already contains that loop. What I’m not sure about is if I would be I’m exploiting documented behavior or an implementation detail by using issorted( ..., lt = <=).

Yes, both choices are possible. And a choice has already been made. This allows Base to have fallback methods.

Can you explain what you don’t see here in the previous post? You can also just look at the docstrings for isless, <, <=, etc.

To be precise, my claim was about isless and <=, not < and <=. And to be less precise, I did use scare quotes around fundamental. I would not say one of these concepts are inherently more fundamental than the other, but a choice has been made to define (by default) comparisons in terms of isless, and not e.g. to define isless in terms of <=.

A user is able to make another choice for their own type. They could define <= first, and then define isless in terms of <= and ==. Their method definition for isless will look generic, but because the user has gone against the design intention of the generic functions, these methods must be limited to their own type, otherwise there will be method ambiguities.

I am really struggling to understand the confusion here. Some people are wishing that issorted was defined in terms of something like <=. Okay, but it’s not. That it is defined in terms of something like isless is reasonable.

3 Likes

I don’t think running isstrictlysorted(::ArrayWithDuplicates) is usually the user’s intent when they run issorted(lst), and having footguns lying around is a bad property in a language.

I’m arguing we should change it. Several ideas arise.

  1. Rename issorted to isstrictlysorted and introduce isweaklysorted which behaves in the expected way
  2. Error on collections that contain duplicates.
  3. Error on duplicate-allowing collection types. E.g. error on Vector, allow for OrderedSet.

I prefer not to error (2,3) because it reduces genericness. Having explicit function names (1) is better.

(Sorry about all the edits.)

First of all, what does this mean? Secondly, just because the answer to something is false, doesn’t mean it’s unreasonable to ask the question.

Please show an actual piece of code that you think is a footgun.

I am going to guess that it’s this:

julia> issorted([1, 1], lt = <=)
false

And the footgun is that a user expects that it returns true? The user will have to have read the docstring for issorted to even know the keyword argument is called lt:

Test whether a vector is in sorted order. The lt, by and rev keywords modify what order is considered to be sorted just as they do for sort.

So, after reading that, the user passes in lt = <=. What is the logical or emotional justification for a user doing that?

Yeah I deleted that line :slight_smile:

This edit still doesn’t help me understand what you meant.

Current behavior:

julia> issorted([1, 1])
true

it’s not a test of whether the list (which, by the way, contains duplicates) is “strictly sorted”.

Sorry for the confusion, I wrote inaccurately above. The unexpected behavior is

julia> issorted([1,1], lt=<=)
false

and more generally that issorted(sort(seq, lt=f), lt=f) isn’t necessarily true for collections with duplicates.

Maybe a solution is to provide an le= parameter.

It is true if you pass in appropriate choices for f. The solution is to document appropriate choices directly. Right now it’s quite indirectly documented in sort!:

the lt keyword allows providing a custom “less than” function

So the footgun, aside from the indirect documentation, is that <= seems like a reasonable choice for a “less than” function to you. So a solution would be to add e.g.:

Note, <= is not a “less than” function.

1 Like