Understanding struct hashes

When i change a dictionary the hash changes.

julia> d1 = Dict{Any,Any}()
julia> hash(d1)
0x172af099a2a3a421

julia> d1[:A] = 13245

julia> hash(d1)
0x9dee08cef4972aa5

If I make a struct with a dictionary the hash of the struct doesn’t change when i change the dictionary.

struct DD
   d::Dict{Any,Any}
end
a = DD(Dict())
oh = hash(a)
a.d[:new] = 3
oh == hash(a) #true

But I can still use a dictionary as a key in another dictionary!

d1 = Dict{Any,Any}(:a=>3)
d2 = Dict(d1=>55)
d1[:b] = rand(3)
d2

How does this black magic work?

by default hash(::Any) just uses objectid. This probably isn’t what you want.

1 Like

Based on this post it surprises me that the hash of a doesn’t change when it’s contents change. I guess structs are not treating the same way as other objects.

There isn’t a thing that I want, except to understand how a dictionary can use another dictionary as a key even though it is mutable.

Using mutable values as a key is a really bad idea. If you change the value once it’s in the table, it will be lost.

2 Likes

Did you run the code? I expected , as you write, that the contents would be “lost”. But they are not. Somehow the outer dictionary keeps track of the updates to the dictionary used as key.

when you wrap in a struct, it doesn’t get lost when you mutate the dict, but it will get lost if you try to index with a copy.

You have just stopped testing one step too soon:

julia> d2[d1]
ERROR: KeyError: key Dict{Any,Any}(:a => 3,:b => [0.8116841574631843, 0.6061759816722163, 0.7968614576661504]) not found
Stacktrace:
 [1] getindex(::Dict{Dict{Any,Any},Int64}, ::Dict{Any,Any}) at ./dict.jl:467
 [2] top-level scope at REPL[5]:1

@Henrique_Becker Ah! Thank you.

The reason i stopped there is because of when inspecting d2 after changing d1 it appears that d2 knows that d1 changed. And what is even stranger (to me) is that if you extract the key from d2 it is indeed d1. But amazingly getting the key and then using it gives a key error.

julia> d1 = Dict(:a=>4)
Dict{Symbol, Int64} with 1 entry:
  :a => 4

julia> d2 = Dict(d1=>5)
Dict{Dict{Symbol, Int64}, Int64} with 1 entry:
  Dict(:a=>4) => 5

julia> d1[:b] = 2345
2345

julia> d2 # It appears in the output that the key to d1 is correctly updated?!?!
Dict{Dict{Symbol, Int64}, Int64} with 1 entry:
  Dict(:a=>4, :b=>2345) => 5

julia> d2[d1]
ERROR: KeyError: key Dict(:a => 4, :b => 2345) not found
Stacktrace:
 [1] getindex(h::Dict{Dict{Symbol, Int64}, Int64}, key::Dict{Symbol, Int64})
   @ Base ./dict.jl:482
 [2] top-level scope
   @ REPL[6]:1

julia> k2 = [k for k in keys(d2)]
1-element Vector{Dict{Symbol, Int64}}:
 Dict(:a => 4, :b => 2345)

julia> k2[1] == d1
true

julia> k2[1] === d1
true

The hash was left in a invalid state because mutating a key is not allowed by the informal contract. Everything after it is unexpected behavior. If you have already implemented a hash you will see that it makes sense with the default implementation. The hash has a mechanism to traverse all keys that is independent from the keys hash value (i.e., the key changing value and, therefore, of hash does not affect this mechanism); on the other hand, when the key was stored, it was stored in a position dependent of their hash at the time, if you change the key and its hash, then the Dict will try to find it in the wrong place and conclude such key is not stored in the Dict, even if it is store, just in the wrong place based in its new hash (in fact, there is a very small chance that, by sheer luck, it is the same place, so you are not every guaranteed this will throw an error).

1 Like

It’s an interesting situation. I’m curious about the design decisions. Julia could just throw an error when using a dictionary (or list or any mutable object) as a key. I guess another way to ask this is: why is this an “informal” rather than a “formal” contract? Just to allow flexibility?

First, Julia does not exactly have formal contracts. By this I mean, there is no feature of the language designed to enforce interface contracts. You can check if the object is mutable and throw an error, but this leads us to the second point: what the informal contract prohibits is to use a mutable key and then mutate it. You can, and sometimes is very useful, use mutable objects as Dict keys. You just should not mutate them while they are keys. And there is no way for the Dict object to be notified if you do this and throw an error when it happens. So yes, it is to allow flexibility, but in fact, what exactly is prohibited cannot be easily checked programmatically, thus the interface is informal.

1 Like

Out of curiosity: What is the rationale for this in the case of a struct? I would find it more useful to combine the hashes of the fields of the struct, similarly to what is done for an Array. I usually end up coding this myself whenever I need a hash for a struct.

The default hash method matches the default equality method which are both designed to be the simplest plausible ones. Arguably, we should have made equality of structs compare their fields, and do the same for hashes by default, but it’s too late to change that now.

FYI, This whole thread started because I was using a struct instance as a key in a dictionary and was surprised it worked. The struct is for an RL agent that has some fixed properties (like learning rate) but some mutable properties like current value associated with each state. I thought I would have to add a unique Id on construction and then refer to that Id as a key.