Would it be possible to update the documentation to include how to implement isequal() and hash()?

Both the C# and Python documentation include clear instructions on how to implement equals and hash code methods and I was wondering if we could do the same for Julia.

For example, the Base.hash() documentation says:

New types should implement the 2-argument form, typically by calling the 2-argument hash method recursively in order to mix hashes of the contents with each other (and with h).

It might be helpful to see an example on how to implement the 2-argument form that the documentation talks about.

The same goes for Base.isequal(): the documentation shows how to use it, but not how to implement it for a custom type.

For example, we could create a struct that represents a Person:

struct Person
    first_name::String
    last_name::String
    age::Int32
end

To override equals we can do:

import Base: ==

function ==(x::Person, y::Person)
    return (x.first_name == y.first_name) &&
           (x.last_name == y.last_name) &&
           (x.age == y.age)
end

Not sure if I did that right, but a correct example like this might be helpful.


I think you can find correct examples in AutoHashEquals, a package that provides a macro for automatically implementing == and hash for structs.


Just out of curiosity, did I implement my equals method correctly? I figured it out by reading the documentation of C# and Python, understanding the logic, and attempting to do the same in Julia.

I was reading this link about implementing Base.hash(), but I’m not sure if it’s done correctly.
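For reference, here is a minimal sketch of the 2-argument hash form for the Person struct, assuming equality means that all fields are equal. Mixing the symbol :Person into the hash is an optional choice made for this illustration, not something the documentation requires.

```julia
struct Person
    first_name::String
    last_name::String
    age::Int32
end

# Field-by-field equality.
Base.:(==)(x::Person, y::Person) =
    x.first_name == y.first_name &&
    x.last_name == y.last_name &&
    x.age == y.age

# 2-argument hash: fold each field into the running hash value `h`.
# Seeding with :Person is an optional choice so that a Person does not
# hash like some other container holding the same three values.
function Base.hash(p::Person, h::UInt)
    h = hash(:Person, h)
    h = hash(p.first_name, h)
    h = hash(p.last_name, h)
    return hash(p.age, h)
end

a = Person("Ada", "Lovelace", 36)
b = Person("Ada", "Lovelace", 36)

a == b               # true
isequal(a, b)        # true, because isequal falls back to ==
hash(a) == hash(b)   # true, as required for values that are isequal
length(Set([a, b]))  # 1
```

The important property is that any two values that compare equal also produce the same hash, so the hash should be built from exactly the fields that == looks at.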

Seems correct to me. Purely as a matter of style, I would break up that long single-liner.
You could also write a generic comparison, in case you add fields to Person later:

function ==(x::Person, y::Person)
    for field in fieldnames(Person)
        if getfield(x, field) != getfield(y, field)
            return false
        end
    end
    return true
end

isequal and == are two different functions. Which one are you talking about?
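A short sketch of where the two differ, and of a custom type that defines both. The Celsius struct is hypothetical and only serves as an illustration.

```julia
# == follows IEEE 754 semantics for floats; isequal is what Dict and Set use.
NaN == NaN           # false
isequal(NaN, NaN)    # true
0.0 == -0.0          # true
isequal(0.0, -0.0)   # false

# Hypothetical wrapper type, used only for illustration.
struct Celsius
    degrees::Float64
end

# Delegate each function to its counterpart on the wrapped field.
Base.:(==)(a::Celsius, b::Celsius) = a.degrees == b.degrees
Base.isequal(a::Celsius, b::Celsius) = isequal(a.degrees, b.degrees)

# hash has to agree with isequal, so it is built from the same field.
Base.hash(c::Celsius, h::UInt) = hash(c.degrees, hash(:Celsius, h))

Celsius(NaN) == Celsius(NaN)         # false
isequal(Celsius(NaN), Celsius(NaN))  # true
```

If only == is defined, isequal falls back to it, which is usually fine unless the fields can hold floating-point values or missing.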


I believe it is correct. @lmiq gave a good generic implementation. In fact, you can use such an implementation for any composite type whose equality is the same as the equality of all its fields. If you care about the performance of the comparison, I would suggest writing it by hand (like you did) and ordering the comparisons so that the fields most likely to differ between two different objects come first (or the cheapest to compare, like Int fields, if all of them have similar distributions).


I wouldn’t recommend using @leandromartinez98’s proposal in a tight loop, though, since it suffers from type instability: getfield(x, field) is type unstable when the field name is only known at run time.

struct Person
    first_name::String
    last_name::String
    Age::Int32
    X::Bool
    Y::Int
    Z::Float64
end

function Base.:(==)(x::Person,y::Person)
    for field in fieldnames(Person)
        if getfield(x,field) != getfield(y,field)
            return false
        end
    end
    return true
end

p1 = Person("Foo", "Bar", 10, true, 5, 0.35)
p2 = Person("Foo", "Bar", 10, true, 5, 0.29)

julia> @code_warntype p1 == p2
Variables
  #self#::Core.Const(==)
  x::Person
  y::Person
  @_4::Union{Nothing, Tuple{Symbol, Int64}}
  field::Symbol

Body::Bool
1 ─ %1  = Main.fieldnames(Main.Person)::Tuple{Vararg{Symbol}}
│         (@_4 = Base.iterate(%1))
│   %3  = (@_4 === nothing)::Bool
│   %4  = Base.not_int(%3)::Bool
└──       goto #6 if not %4
2 ┄ %6  = @_4::Tuple{Symbol, Int64}::Tuple{Symbol, Int64}
│         (field = Core.getfield(%6, 1))
│   %8  = Core.getfield(%6, 2)::Int64
│   %9  = Main.getfield(x, field)::Any
│   %10 = Main.getfield(y, field)::Any
│   %11 = (%9 != %10)::Any
└──       goto #4 if not %11
3 ─       return false
4 ─       (@_4 = Base.iterate(%1, %8))
│   %15 = (@_4 === nothing)::Bool
│   %16 = Base.not_int(%15)::Bool
└──       goto #6 if not %16
5 ─       goto #2
6 ┄       return true

And

using BenchmarkTools
f(p1, p2) = p1 == p2

julia> @btime f($p1, $p2)
  1.264 μs (15 allocations: 672 bytes)

So it is always better to write it by hand. But the good news is that you rarely need it for immutable structs, since the default equality and hash are good enough.
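A small sketch of what the defaults do, using two hypothetical structs (Point and Bag). Without any user-defined methods, == falls back to ===, which compares immutable values by content but mutable ones (such as arrays) by identity.

```julia
# Hypothetical immutable struct whose fields are all immutable values.
struct Point
    x::Int
    label::String
end

# No methods defined: the default == and hash already behave as expected.
Point(1, "a") == Point(1, "a")              # true
hash(Point(1, "a")) == hash(Point(1, "a"))  # true

# Hypothetical struct holding a mutable field.
struct Bag
    items::Vector{Int}
end

# Two separately built arrays are different objects for ===.
Bag([1, 2]) == Bag([1, 2])  # false

v = [1, 2]
Bag(v) == Bag(v)            # true, both wrap the same array
```

So custom == and hash methods become necessary mainly when a field is mutable, or when equality should ignore or normalize some of the fields.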


Nice, good to know.

But this generic approach is a good application for a macro, because this is exactly a case of dumb copy-paste:

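macro equality(T, x, y)
    # Look up the type in the caller's module. This happens at
    # macro-expansion time, so T must already be defined there.
    q = quote end
    for field in fieldnames(getfield(__module__, T))
        # Append one unrolled comparison per field.
        q = quote
            $q
            if $(esc(x)).$(QuoteNode(field)) != $(esc(y)).$(QuoteNode(field))
                return false
            end
        end
    end

    return q
end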

function Base.:(==)(p1::Person, p2::Person)
    @equality Person p1 p2
    return true
end

julia> @btime f($p1, $p2)
  8.713 ns (0 allocations: 0 bytes)

Seems like we have come full circle, as my first comment in this thread links to a package that provides a macro for implementing == and hash automatically.


Nice :slight_smile:


You can see all the methods with methods(Base.hash), and just pick one with @edit or similar. E.g.

@edit hash(1 => 2, UInt(0))

will take you to the method for Pair (your IDE may just support clicking on the result of methods directly).

Generally, the Julia source has plenty of examples of these kinds of functions, and it is worth investing in learning how to navigate it. While examples could be extracted into docstrings, real-life examples should be at least as good as toy ones.
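Outside an interactive session, the same lookup can be sketched without opening an editor. This assumes only the InteractiveUtils standard library, which provides @which alongside @edit.

```julia
using InteractiveUtils  # provides @which and @edit outside the REPL

# List every method of hash known in this session.
ms = methods(hash)

# Find the method that would handle a Pair, without opening an editor.
m = @which hash(1 => 2, UInt(0))

println(length(ms), " hash methods defined")
println(m)  # shows the signature together with its source location
```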


Despite the problem mentioned above, I think it’s still a pretty interesting way to solve the problem. Perhaps getfield() will one day be type stable and we can use a for loop.

This is one thing I admire about Julia: even if the documentation isn’t clear, the answer might be somewhere in the ecosystem, and if not, you can ask a question on here and get a quick reply.

This is what frustrated me about Python: while on the surface the documentation seems okay, it’s surprisingly lacking when you dig deeper. For example, the documentation contains no example of multiple inheritance, some StackOverflow answers are incorrect, and you have to read a blog post by one of the developers to see a basic example of multiple inheritance (which is itself lacking, because it only inherits from one class).


No, it wouldn’t, for a very simple reason. Type stability means that for the same argument types, the output always has the same type. The input arguments of getfield(x::T, field::Symbol) have the types T (the type of the object) and Symbol (the name of the field). But its output type can be any type that occurs inside the struct, so this function is type unstable by design.
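A small sketch of that distinction, assuming a hypothetical two-field struct Pet. When the field name is a literal in the source, the compiler knows which field is read; when it arrives as a run-time Symbol, the result can be any of the field types. Base.return_types is an unexported helper, used here only to inspect what inference concludes.

```julia
# Hypothetical struct with two differently typed fields.
struct Pet
    name::String
    age::Int
end

# Field name is a literal, so the compiler knows which field is read.
get_age(p::Pet) = getfield(p, :age)

# Field name is a run-time value, so the result may be a String or an Int.
get_any(p::Pet, field::Symbol) = getfield(p, field)

# Ask inference for the return type given only the argument types.
age_type = only(Base.return_types(get_age, (Pet,)))
any_type = only(Base.return_types(get_any, (Pet, Symbol)))

isconcretetype(age_type)  # true
isconcretetype(any_type)  # false
```

This is why the hand-written and macro-generated versions are fast: each comparison names its field literally, so every access has a known type.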
