Just to make explicit the idea behind the previous answers: you are free to do anything with hash as long as the invariant “a == b implies hash(a) == hash(b)” is maintained (Julia's documentation states it in terms of isequal). So you could define hash to return a constant: that is very fast to compute, but every object then collides, which makes storing objects in a Set inefficient. The idea is to find a tradeoff where hash is reasonably fast while limiting the number of collisions (i.e. we want hash(a) != hash(b) when a != b as often as possible).
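To illustrate the invariant, here is a minimal sketch. The `Bag` type is hypothetical and exists only for this example; it wraps a `Vector`, so the default `==` would compare by identity and both methods need to be defined together.

```julia
# Hypothetical type, used only for illustration.
struct Bag
    items::Vector{Int}
end

# Value equality: two bags are equal when their contents are equal.
Base.:(==)(a::Bag, b::Bag) = a.items == b.items

# Hash the same data that == compares, threading the seed `h` through.
# The :Bag symbol just keeps a Bag from hashing like its bare vector.
Base.hash(b::Bag, h::UInt) = hash(b.items, hash(:Bag, h))

# A legal but degenerate alternative would be
#     Base.hash(b::Bag, h::UInt) = h
# which satisfies the invariant but makes every Bag collide.

s = Set([Bag([1, 2]), Bag([1, 2]), Bag([3])])
length(s)  # 2: the two equal bags are stored once
```

The two-argument form `hash(x, h::UInt)` is the method to overload; the one-argument `hash(x)` calls it with a default seed.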
I use hash(Node) as the keys of my Dict, which has the form Dict{UInt64, Int64}. My dictionary gets gigantic and I was hitting RAM issues, so I resorted to using the hash() of the Nodes as keys instead of the Nodes themselves, which works for my use case. However, this means that every time I add to the dictionary I have to hash() my new Node, and likewise every time I want to look up a value.
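A rough sketch of the pattern described above, assuming a hypothetical `Node` with a single `state` field (the real struct is not shown in the thread):

```julia
# Hypothetical Node definition; the field name is an assumption.
struct Node
    state::Vector{Int}
end

Base.:(==)(a::Node, b::Node) = a.state == b.state
Base.hash(n::Node, h::UInt) = hash(n.state, hash(:Node, h))

# Keys are the hashes, not the Nodes, so the Nodes themselves
# do not have to be kept alive by the dictionary.
table = Dict{UInt64, Int64}()

n = Node([1, 2, 3])
table[hash(n)] = 42                       # hash once on insert

# An equal Node built elsewhere finds the same entry.
get(table, hash(Node([1, 2, 3])), -1)     # 42
```

One consequence of this design is that two different Nodes that happen to share a 64-bit hash would silently share an entry, so it only fits use cases that can tolerate that.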
Thank you for the excellent suggestion, but my understanding of IdDict is that it uses the === operator, i.e. it keeps keys unique by object identity rather than by value equality, which is not what I need. My Node objects are distinct objects, but I want uniqueness to be defined by value.
Thank you so much for everyone’s help! What ended up working for me was @tisztamo's suggestion: storing the hash of the Node struct as a field inside Node itself, and updating it whenever the values change.
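A minimal sketch of the cached-hash idea, under stated assumptions: `CachedNode`, its fields, and the `setvalue!` helper are hypothetical names, and the sketch overloads `Base.hash` rather than touching any `Dict` internals.

```julia
# Hypothetical struct: the hash of `values` is stored alongside the data.
mutable struct CachedNode
    values::Vector{Int}
    cached_hash::UInt
    # Compute the hash once, at construction time.
    CachedNode(values::Vector{Int}) = new(values, hash(values))
end

Base.:(==)(a::CachedNode, b::CachedNode) = a.values == b.values

# Hashing a node now only mixes the stored value with the seed,
# instead of walking the whole `values` vector again.
Base.hash(n::CachedNode, h::UInt) = hash(n.cached_hash, h)

# Hypothetical mutator: every change must go through a function
# like this one so that the cache never goes stale.
function setvalue!(n::CachedNode, i::Integer, v::Integer)
    n.values[i] = v
    n.cached_hash = hash(n.values)
    return n
end

a = CachedNode([1, 2, 3])
b = CachedNode([1, 2, 3])
hash(a) == hash(b)   # true: equal values give equal hashes
setvalue!(a, 1, 9)   # refreshes a.cached_hash
```

The cache is only correct if nothing writes to `values` directly. Also, changing a node while it is stored as a key in a Set or Dict leaves it filed under its old hash, so it should be removed before the change and re-inserted afterwards.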
I am also using a Set to keep my Nodes and this solution entailed changing the hashindex function in Dict (because Sets are implemented as Dicts under the hood) from this:
I think there is no simple and general solution for this, at least if you want to allow manipulating the content from the “outside”, or you have a deeply nested structure.
But in this concrete case it seems possible to create a custom array type with overloaded setindex!, that either notifies the container to invalidate the hash-cache, or calculates the diff of the hash and updates the cache - which one is better depends on updating patterns, I think.
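The invalidation variant could look roughly like the sketch below. `TrackedVector` is a hypothetical name, and for simplicity it is a plain wrapper rather than a subtype of `AbstractVector`, so it only compares equal to other `TrackedVector`s.

```julia
# Hypothetical wrapper that remembers its hash until the data changes.
mutable struct TrackedVector{T}
    data::Vector{T}
    cached_hash::UInt
    valid::Bool        # false means the cache must be recomputed
end

# Start with an empty (invalid) cache.
TrackedVector(data::Vector{T}) where {T} =
    TrackedVector{T}(data, zero(UInt), false)

Base.length(v::TrackedVector) = length(v.data)
Base.getindex(v::TrackedVector, i::Integer) = v.data[i]

# Overloaded setindex!: write the element and invalidate the cache.
function Base.setindex!(v::TrackedVector, x, i::Integer)
    v.data[i] = x
    v.valid = false
    return v
end

Base.:(==)(a::TrackedVector, b::TrackedVector) = a.data == b.data

# Recompute lazily, only when the cache was invalidated.
function Base.hash(v::TrackedVector, h::UInt)
    if !v.valid
        v.cached_hash = hash(v.data)
        v.valid = true
    end
    return hash(v.cached_hash, h)
end

v = TrackedVector([1, 2, 3])
hash(v)      # computes and caches
v[2] = 7     # invalidates the cache
hash(v)      # recomputes from the new contents
```

Lazy invalidation suits bursts of writes followed by reads, since the hash is recomputed at most once per burst. The diff-updating variant is not shown; it would need a hash that can be adjusted per element, which the built-in array hash does not offer.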