Behavior of == when isless or < is defined

sambitdash · April 2, 2018, 9:02am

Hi All,

I will like to understand the behavior of == of an immutable type when < is defined. When strict ordering exists in a system, == is not mandated.

For example:

==(a, b) =  !(a < b) && !(b < a)

Is it how Julia treats the immutable types. I believe the default implementation is:

==(a, b) = a === b

regards,

Sambit

Tamas_Papp · April 2, 2018, 2:04pm

Sure, but how would Julia know that a strict total order is supposed to exist for a type? Eg,

mutable struct Foo
    x::Int
end

Base.:<(a::Foo, b::Foo) = a.x < b.x
Base.:>(a::Foo, b::Foo) = a.x > b.x

and then

julia> Foo(1) < Foo(2)
true

julia> Foo(2) > Foo(1)
true

julia> Foo(1) == Foo(2)
false

because figuring out that the method < is a total order would, in general, require a mathematical proof, not something that the compiler is equipped to do.

If you want == for signatures not handled by fallback methods, simply define it. Cf

sambitdash · April 2, 2018, 2:52pm

@Tamas_Papp my understanding is when < is defined > should be by default b < a. So that should be default.

May be there should be a trait called ‘Ordered’ for which only < should be defined. == and > should be derived. < and > can have other symbolic notations for override but default for Ordered should mandate only < is defined.

y4lu · April 3, 2018, 2:02am

There can be odd cases for custom isless functions though, eg. b.y acting as a positive fuzziness parameter

Base.:<(a::Foo, b::Foo) = a.x < (b.x + b.y)
julia> Foo(1,5) < Foo(2,5)
true

julia> Foo(2,5) < Foo(1,5)
true

==(a::Foo, b::Foo) = (a < b) == (b < a) might be better in this case

chakravala · April 3, 2018, 2:11am

Total ordering is not required. I made a package that implements the Tamari partially ordered set

https://github.com/chakravala/Dendriform.jl/blob/master/src/poset.jl

Simply defined < and >, separately, and comparing any binary tree elements satisfies the partial ordering

So one can certainly work with non-total orderings too.

sambitdash · April 3, 2018, 4:06am

Should the default be what is common and intuitive or exceptions exists hence nothing should be defined and should be left to the developer. My point of view is one can define all the operators independently, but that should not be default. Default when Ordered trait is defined only one operator should be defined it can be either of < or > and not both.

StefanKarpinski · April 3, 2018, 4:25am

Can you think of a reason that might not be such a good idea?

sambitdash · April 3, 2018, 4:33am

@StefanKarpinski only one reason I can think of is efficiency. As == will need 2 computations. But, now every time you code you need to think about ensuring consistency exist btw <, > , == and hash as well which is quite cumbersome as well.

StefanKarpinski · April 3, 2018, 4:46am

I do think it’s surprisingly rare for that fallback definition to be both correct and efficient. Integers may be the only case I can think of where the behavior is simple enough and compilers are smart enough to turn the fallback into a correct and efficient implementation.

jandehaan · April 3, 2018, 5:07am

I know that some people believe that sour < salty < sweet doesn’t always imply sour < sweet. It depends on what kind of minerals your body is craving.
But should that preclude Julia from insisting that !(a < b) && !(b < a) implies a == b?

jandehaan · April 3, 2018, 5:31am

There are many systems of logic. Examples include propositional logic, description logic, first-order logic, fuzzy logic, modal logic. (Logic - Wikipedia)
Maybe somebody should develop a package that defines the meaning of operators like <, <=, ==, ===, >=, >, etc. based on the type of logic the user want to apply.

sambitdash · April 3, 2018, 5:43am

@jandehaan 0.7 has concept of traits implements in parts. So a StrictOrdered trait can be defined to specify that ordering. There is no need to define a complete package for the same. May be RangeOrdered trait can be defined for a partial ordering case etc.

Tamas_Papp · April 3, 2018, 6:00am

I share the skepticism of @StefanKarpinski about this being a good idea — programming is not math, and this is not how one would usually build functionality for an ordered type. The cost of a == is usually comparable to a <, and your approach would double this. An approach that works well in many cases is Base.cmp, with the user defining a single function that serves as a basis for all comparisons.

That said, trait-based approaches could have some value. But this is something that can be hammered out in a package first, using one of the nice Unicode relation symbols instead of < etc.

sambitdash · April 3, 2018, 6:36am

@Tamas_Papp you have a very valid point. In that case, can the following be the failover methods.

Base:(<)(a, b)  = cmp(a, b)  < 0
Base:(==)(a, b) = cmp(a, b) == 0
Base:(>)(a, b)  = cmp(a, b)  > 0

All the developer has to define cmp(a, b), than think about ensuring fallbacks are consistent every time he has to overload one of them. Definitely, advanced developers have to think of better fallbacks.

Note: Here also the last computation will exclude the first two computation cases. Hence, needs two comparisons. But, if == has to be made more efficient may be that should be the first in the 3 conditions to be defined in cmp.

Tamas_Papp · April 3, 2018, 6:51am

This is how it works for certain types, and the other way round (cmp defined from isless) for others (which is also the general fallback).

I don’t understand this. If cmp can be implemented efficiently, then == above calls it once.

Which approach is more efficient usually depends on the internals of the type. I think that the sanest way to do this is to provide a reasonable fallback that just works, but may not be optimal in all cases, and then specialize the methods if necessary. This is what Base currently does.

sambitdash · April 3, 2018, 6:59am

@Tamas_Papp if you code requires == to be hit maximum no of times followed by < the >

function cmp(a, b) 
      eq(a, b) && return 0
      lt(a, b) && return -1
      return 1
end

In the above case > will always take the beating. Technically, two operations will definitely be twice as slower than one. If cmp is implemented with isless it definitely does what I have been suggesting. Just that the fallback for == should not be to ===.

Other than integers as @StefanKarpinski pointed earlier. As registers may set SF and ZF and that may be combined picked up by one instruction. My knowledge of CPUs has not been upgraded in years. So I better be corrected here

Tamas_Papp · April 3, 2018, 7:19am

I am sorry, but I don’t understand what you are saying.

That is an orthogonal point. Equality is a meaningful comparison even for objects which don’t have an order.

sambitdash · April 3, 2018, 8:18am

@Tamas_Papp I am not knowledgeable in this but what I know of runtime branch optimizing compilers they would optimize the branch which will be most commonly computed to the beginning to reduce the no of comparisons. As a practice developers who would not trust much on compiler will place the comparison which is most commonly carried out operation at the beginning. But any branch optimization will penalize the least occurring branch to the most commonly occurring branch or do away will branching altogether in cases where possible like the integer case may be.

The maths of ordered equality and unordered equality must match in cases where order is defined. In Julia the fallback for a <= b is (a < b) |( a == b) but this is not same as: !( a > b) in cases where === is not properly defined. I got into this kind of a case in a package which triggered all these analysis. Do not want to get the specifics here as that may not of generic interest.

Thanks a lot for pointing me to the correct source code. I got the answer I was looking for.

sambitdash · April 4, 2018, 2:10am

Lastly, I will like to highlight the fallback implementation difference from v0.6 to 0.7. This may affect some of the data structure related packages:

v0.6 implementation:

<=(x, y) = !(y < x)
==(x, y) = x === y

v0.7 implementation:

<=(x, y) = (x < y) | (x == y)
==(x, y) = x === y

This I believe needs a mention is the news. 0.6 implementation has inherent strong ordering assumption. 0.7 implementation may be more accurate but expects a sane implementation of == or hash from the developer for every structure where comparison is carried out. So code which was working perfectly in 0.6 suddenly started bringing in logical errors in 0.7 in the test cases.

https://github.com/JuliaLang/julia/blob/master/NEWS.md

Tamas_Papp · April 4, 2018, 7:18am

Note that === is a built-in (so users don’t need to implement it, actually, they can’t even override/extend it), and implements

identical, in the sense that no program could distinguish them

I don’t know what you mean by “sane implementation” in this context. If you find that === does not behave according to the above definition in some cases, that is a bug.