Understanding issorted's lt keyword

mkitti · May 10, 2021, 9:34pm

The lt keyword for issorted and sort! is defying my intuition. From reading the source code, I understand why I get the following results, but I am finding it counterintuitive.

From the documentation for sort!:

the lt keyword allows providing a custom “less than” function

Let’s try it:

julia> issorted([1, 2, 3, 3])
true

julia> issorted([1, 2, 3, 3], lt = <)
true

julia> issorted([1, 2, 3, 3], lt = <=)
false

julia> issorted( sort([1, 2, 3, 4, 3], lt = <=), lt = <=)
false

I would think confirm_sort(A, B) = issorted( sort( A, lt = B), lt = B) would always evaluate to be true.

Can anyone explain how to think of the lt keyword intuitively?

goretkin · May 10, 2021, 9:59pm

help?> issorted
search: issorted InsertionSort

  issorted(v, lt=isless, by=identity, rev:Bool=false, order::Ordering=Forward)

  Test whether a vector is in sorted order. The lt, by and rev keywords modify what order is considered to be sorted just as they do for sort.

help?> sort
[...] 

  sort(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

The default lt function is isless.

help?> isless
[...]

  isless(x, y)

  Test whether x is less than y, according to a fixed total order. isless is not defined on all pairs of values (x, y). However, if it is defined, it is expected to satisfy the
  following:

    •  If isless(x, y) is defined, then so is isless(y, x) and isequal(x, y), and exactly one of those three yields true.

Note that lt = <= does not satisfy that property, since for x == y, all three (lt(x, y), lt(y, x), isequal(x, y)) are true.

˜~~Perhaps more intuitively, ask yourself how you might implement issorted, and what you’d expect it to be able to do just given the <= function?~~~ [ Edit: I’m not sure what I had meant by this]

jzr · May 10, 2021, 10:03pm

I don’t see how that explains

julia> issorted([1,1], lt=<=)
false

mkitti · May 10, 2021, 10:05pm

Part of the reason I am interested in lt = <= is that I actually want to verify that an iterable is both sorted and unique. I want the sequence to be strictly monotonically increasing, not just sorted.

My intuition would have been to indicate lt = < to accomplish that, but that is wrong.

goretkin · May 10, 2021, 10:13pm

You are probably guessing at an implementation for issorted that is incorrect. It’s not that lt(x, y) must be satisfied for consecutive x and y. Otherwise issorted([1, 1], lt=isless) would be false.

jzr · May 10, 2021, 10:19pm

What is it then?

goretkin · May 10, 2021, 10:22pm

I think I see the pattern you are following, but ultimately issorted is likely to be the wrong concept, since you’re not asking if a list is sorted, but you’re asking if it’s strictly increasing. I think you should think of issorted as a verifier for sort. There’s nothing that sort can do to make some data strictly increasing and still look like a sort (it would have to e.g. discard some data).

Consider something instead like:

julia> using IterTools: partition

julia> all_consecutive_pairs(pred, itr) = all(Base.splat(pred), partition(itr, 2, 1))
all_consecutive_pairs (generic function with 1 method)

julia> all_consecutive_pairs(<, [1, 2, 3])
true

julia> all_consecutive_pairs(<, [1, 3, 3])
false

jzr · May 10, 2021, 10:33pm

This still doesn’t make sense to me.

julia> issorted([1,1], lt=<=)
false

mkitti · May 10, 2021, 11:11pm

Here is the source code for issorted:

https://github.com/JuliaLang/julia/blob/bb5b98e72a151c41471d8cc14cacb495d647fb7f/base/sort.jl#L56-L68

https://github.com/JuliaLang/julia/blob/bb5b98e72a151c41471d8cc14cacb495d647fb7f/base/sort.jl#L91-L93

Basically it goes over all the numbers. If the current number is “less than” the previous number, then it reports that the iterable is not sorted.

In the issorted([1,1], lt=<=) case, it instead asks if the current number is “less than or equal to” the previous number. If true, then it indicates the array is not sorted. The evaluation here boils down to

if 1 <= 1
    return false
end

jzr · May 10, 2021, 11:28pm

Thanks. Tbh that behavior still seems incongruous with the docs which say “Test whether a vector is in sorted order.”

mkitti · May 11, 2021, 12:40am

Ultimately, I think @goretkin is right that the implementation is key. Currently, the default is

if this < prev
    return false
end

The alternative would be as follows.

if !( this >= prev )
    return false
end

The second version seems less efficient since you have to do the >= comparison and then negate it. However, this is Julia, and we have a fancy compiler.

julia> function f(this, prev)
           if this < prev
               return false
           end
           return true
       end
f (generic function with 1 method)

julia> function g(this, prev)
           if !( this >= prev )
               return false
           end
           return true
       end
g (generic function with 1 method)

julia> f(1,2)
false

julia> g(1,2)
false

julia> f(2, 1)
true

julia> g(2,1)
true

julia> @code_llvm debuginfo=:none f(2, 1)
; Function Attrs: uwtable
define i8 @julia_f_816(i64 signext %0, i64 signext %1) #0 {
top:
  %.not = icmp sge i64 %0, %1
  %spec.select = zext i1 %.not to i8
  ret i8 %spec.select
}

julia> @code_llvm debuginfo=:none g(2, 1)
; Function Attrs: uwtable
define i8 @julia_g_818(i64 signext %0, i64 signext %1) #0 {
top:
  %.not = icmp sle i64 %1, %0
  %spec.select = zext i1 %.not to i8
  ret i8 %spec.select
}

Both versions compile to very similar code.

julia> @code_native debuginfo=:none issorted([1, 2, 3])
        .text
        pushq   %rbp
        movq    %rsp, %rbp
        movq    8(%rcx), %r9
        movb    $1, %al
        testq   %r9, %r9
        je      L57
        cmpq    $1, %r9
        je      L57
        movq    (%rcx), %r8
        movq    (%r8), %rdx
        movl    $2, %ecx
L32:
        movq    %rdx, %r10
        movq    -8(%r8,%rcx,8), %rdx
        cmpq    %r10, %rdx
        jl      L55
        cmpq    %r9, %rcx
        jae     L57
        incq    %rcx
        jmp     L32
L55:
        xorl    %eax, %eax
L57:
        popq    %rbp
        retq
        nopl    (%rax,%rax)

julia> import Base: issorted, lt

julia> function issorted(itr, order::Base.Order.Ordering)
           y = iterate(itr)
           y === nothing && return true
           prev, state = y
           y = iterate(itr, state)
           while y !== nothing
               this, state = y
               !lt(order, prev, this) && return false
               prev = this
               y = iterate(itr, state)
           end
           return true
       end
issorted (generic function with 5 methods)

julia> @code_native debuginfo=:none issorted([1, 2, 3], lt = <= )
        .text
        pushq   %rbp
        movq    %rsp, %rbp
        movq    8(%rcx), %r9
        movb    $1, %al
        testq   %r9, %r9
        je      L57
        cmpq    $1, %r9
        je      L57
        movq    (%rcx), %r8
        movq    (%r8), %rdx
        movl    $2, %ecx
L32:
        movq    %rdx, %r10
        movq    -8(%r8,%rcx,8), %rdx
        cmpq    %rdx, %r10
        jg      L55
        cmpq    %r9, %rcx
        jae     L57
        incq    %rcx
        jmp     L32
L55:
        xorl    %eax, %eax
L57:
        popq    %rbp
        retq
        nopl    (%rax,%rax)

Again, it looks like in practice there is not much difference for the given lt. Where this may make a difference is for an arbitrary lt that the compiler cannot easily negate.

jling · May 11, 2021, 1:59am

you can’t always(?) trivially “flip it” since lt= can be any general binary function

tbeason · May 11, 2021, 2:23am

This is an interesting find. I think the easiest way to handle this might be a new keyword, since it seems the implementation does appear to favor strict comparisons. Regardless of the outcome, I think a statement on how ties are handled belongs in the docstring.

mkitti · May 11, 2021, 11:26pm

Ultimately, I think @goretkin is right that the implementation is key. Currently, the default is

if this < prev
    return false
end

The alternative would be as follows.

if !( this >= prev )
    return false
end

The second version seems less efficient since you have to do the >= comparison and then negate it. However, this is Julia, and we have a fancy compiler.

julia> function f(this, prev)
           if this < prev
               return false
           end
           return true
       end
f (generic function with 1 method)

julia> function g(this, prev)
           if !( this >= prev )
               return false
           end
           return true
       end
g (generic function with 1 method)

julia> f(1,2)
false

julia> g(1,2)
false

julia> f(2, 1)
true

julia> g(2,1)
true

julia> @code_llvm debuginfo=:none f(2, 1)
; Function Attrs: uwtable
define i8 @julia_f_816(i64 signext %0, i64 signext %1) #0 {
top:
  %.not = icmp sge i64 %0, %1
  %spec.select = zext i1 %.not to i8
  ret i8 %spec.select
}

julia> @code_llvm debuginfo=:none g(2, 1)
; Function Attrs: uwtable
define i8 @julia_g_818(i64 signext %0, i64 signext %1) #0 {
top:
  %.not = icmp sle i64 %1, %0
  %spec.select = zext i1 %.not to i8
  ret i8 %spec.select
}

Both versions compile to very similar code.

julia> @code_native debuginfo=:none issorted([1, 2, 3])
        .text
        pushq   %rbp
        movq    %rsp, %rbp
        movq    8(%rcx), %r9
        movb    $1, %al
        testq   %r9, %r9
        je      L57
        cmpq    $1, %r9
        je      L57
        movq    (%rcx), %r8
        movq    (%r8), %rdx
        movl    $2, %ecx
L32:
        movq    %rdx, %r10
        movq    -8(%r8,%rcx,8), %rdx
        cmpq    %r10, %rdx
        jl      L55
        cmpq    %r9, %rcx
        jae     L57
        incq    %rcx
        jmp     L32
L55:
        xorl    %eax, %eax
L57:
        popq    %rbp
        retq
        nopl    (%rax,%rax)

julia> import Base: issorted, lt

julia> function issorted(itr, order::Base.Order.Ordering)
           y = iterate(itr)
           y === nothing && return true
           prev, state = y
           y = iterate(itr, state)
           while y !== nothing
               this, state = y
               !lt(order, prev, this) && return false
               prev = this
               y = iterate(itr, state)
           end
           return true
       end
issorted (generic function with 5 methods)

julia> @code_native debuginfo=:none issorted([1, 2, 3], lt = <= )
        .text
        pushq   %rbp
        movq    %rsp, %rbp
        movq    8(%rcx), %r9
        movb    $1, %al
        testq   %r9, %r9
        je      L57
        cmpq    $1, %r9
        je      L57
        movq    (%rcx), %r8
        movq    (%r8), %rdx
        movl    $2, %ecx
L32:
        movq    %rdx, %r10
        movq    -8(%r8,%rcx,8), %rdx
        cmpq    %rdx, %r10
        jg      L55
        cmpq    %r9, %rcx
        jae     L57
        incq    %rcx
        jmp     L32
L55:
        xorl    %eax, %eax
L57:
        popq    %rbp
        retq
        nopl    (%rax,%rax)

Again, it looks like in practice there is not much difference for the given lt. Where this may make a difference is for an arbitrary lt that the compiler does not recognize.

kimikage · May 12, 2021, 12:06am

BTW, it looks more natural if we use ||.

qsong · May 12, 2021, 3:55am

issorted([a,b]) is same as !isless(b,a),
so it checks whether [a, b] is non-decreasing;
OTOH, issorted([a,b], lt=<=) is just isless(a,b) and to check strictly increasing.
The default lt(b,a), not lt(a,b) make the situation a little confusing.

DNF · May 12, 2021, 5:58am

It seems very strange to me that strict comparison, <, does not require “strictly increasing”.

It’s also not great that one must “look at the implementation” to understand what issorted does. The current docs seem to assume that the behavior is intuitively obvious. To me the behavior is profoundly weird.

ettersi · May 12, 2021, 6:56am

Maybe issorted() should have another keyword argument strict::Bool?

jzr · May 12, 2021, 7:01am

I don’t think so; I think it should just do the obvious thing as documented, not this weird other thing that it currently does.

sostock · May 12, 2021, 7:24am

We should definitely document that lt must define a strict total order. Aside from the documentation issue, it is not clear to me what behavior you would prefer:

On one hand, you expect that issorted(sort(A, lt=x), lt=x) returns true for all x (which is the case if x defines a strict total order).
On the other hand, you want to have a my_lt (either < or <=) such that issorted(sort(A, lt=my_lt)) returns false when A contains non-unique elements.

So what should sort(A, lt=my_lt) do? Remove non-unique elements? It seems very counterintuitive to me that sort would delete elements from the vector.

Topic		Replies	Views
Meaning of `isless`, `<`, `cmp`, `isequal`, `==`, total order, partial order, unordered Internals & Design	24	2321	January 7, 2018
Sortperm with keyword lt = !isless throws BoundsError General Usage	9	537	January 8, 2023
Behavior of == when isless or < is defined General Usage	22	2116	April 4, 2018
What should `issorted` return when an order isn't defined for the elements, but a comparison works? General Usage sort	4	232	July 10, 2023
Why is there no isless(::Int64, ::UnitRange{Int64})? General Usage range , unitrange	45	969	April 9, 2025

Understanding issorted's lt keyword

Related topics