DataType hash differs per patch version and OS

I noticed that hash(Float32) returns a different value on each Julia version.

Other hash calls return the same hash for the same value, e.g.

julia> hash("a")

is the same on Julia versions 1.3.1, 1.5.0, 1.5.1 and 1.5.2.
But hash(Float32) returns a different value on each version. These are the values I observed:

julia> hash(Float32)
julia> hash(Float32)
julia> hash(Float32)
julia> hash(Float32)
julia> hash(Float32)

Is this intended, or is it a bug?
If it’s intended, what is the reason?

Also, why is it different on Windows and Linux?

This isn’t a bug. Hashing has changed over time for some types to make it faster.


I see, that makes sense. But why is it different per OS?

Oh. I missed that part of it. That might be a bug.

A hash is not a checksum. If you require that kind of stability across versions and OSes, try the SHA standard library and the functions it provides.
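A minimal sketch of that suggestion, assuming it is enough to identify a type by its printed name. The helper name stable_type_digest is hypothetical, not part of Base or SHA, and string(T) can print differently depending on module context, so treat this as an illustration only:

```julia
using SHA

# Hypothetical helper: SHA-256 digest of the printed type name.
# The digest depends only on the bytes of the string, not on the Julia build.
stable_type_digest(T::Type) = bytes2hex(sha256(string(T)))

digest = stable_type_digest(Float32)
println(digest)   # 64 hex characters
```

Because the input is just the string "Float32", the same digest should come out on any version or OS where the type prints the same way.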

If those functions differ between OS, that’s definitely a bug.

The reason hash(Float32) differs so much is the way generic hashing is implemented. It falls back to hashing an internal identifier of the object, which can change with each version, each OS, and even each build from source:

In hash(x) at hashing.jl:18
>18  hash(x::Any) = hash(x, zero(UInt))

About to run: (hash)(Float32, 0x0000000000000000)
1|debug> s
In hash(x, h) at hashing.jl:23
>23  hash(@nospecialize(x), h::UInt) = hash_uint(3h - objectid(x))


Get a hash value for `x` based on object identity. `objectid(x)==objectid(y)` if `x === y`.
objectid(@nospecialize(x)) = ccall(:jl_object_id, UInt, (Any,), x)
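The identity-based behaviour described in that docstring can be checked within a single session. This is a rough sketch: Box is a hypothetical type with no custom hash method, and nothing here says anything about values across builds:

```julia
# Hypothetical mutable type with no custom hash method defined.
mutable struct Box
    x::Int
end

a = Box(1)
b = Box(1)

println(a === b)                          # false: two distinct objects
println(objectid(a) == objectid(b))       # false: identity differs, contents do not matter
println(objectid(a) == objectid(a))       # true: same object, same id
println(hash(Float32) == hash(Float32))   # true within one session
```

So the value is consistent inside one running process, which is all that Dict and Set need.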

This is not the reason for the instability. jl_object_id is stable for many other types. The hash of a DataType changes between builds by design (it includes the build time of the module). It’s not a bug. It could be improved to reduce variation between builds, but there is currently no guarantee that different builds return the same value.


Oh, so the DataType hash differs between Windows and Linux because of different build times?

Hashes for Types in general are just object IDs which are essentially arbitrary.

They are not any more arbitrary than other types of hash.
