Use of hash code in integer hashing

BioTurboNick · June 1, 2022, 5:44pm

I’m curious about the design of the integer hashing functions. Someone in my company coming from Python was annoyed to discover that the second argument to hash doesn’t produce very different results for incremented values of h, and there’s at least one package he was trying to use (bloom filters in Probably.jl) that assumes they would be.

julia> hash(0x000000000796a326, UInt64(0))
0x574cf859055c7b75

julia> hash(0x000000000796a326, UInt64(1))
0x574cf859055c7b72

julia> hash(0x000000000796a326, UInt64(2))
0x574cf859055c7b6f

julia> hash(0x000000000796a326, UInt64(3))
0x574cf859055c7b6c

# from hashing.jl:
hash(x::Int64,  h::UInt) = hash_uint64(bitcast(UInt64, x)) - 3h
hash(x::UInt64, h::UInt) = hash_uint64(x) - 3h

I’d like to be able to explain this design decision to him, and perhaps expand the documentation to warn about using the hash function that way.
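To make the pattern in the REPL output explicit, here is a minimal sketch of a toy model with the same shape as the definition quoted from hashing.jl. It is not Base's actual code: `mix64` and `toy_hash` are hypothetical names, and `mix64` is just a splitmix64-style stand-in for `hash_uint64`. Because the seed is only subtracted after mixing, consecutive seeds always land exactly 3 apart.

```julia
# Toy model of the quoted definition, NOT Base's actual implementation.
# `mix64` is a hypothetical stand-in for `hash_uint64` (splitmix64-style finalizer).
function mix64(z::UInt64)
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

# Same shape as `hash_uint64(x) - 3h`: the seed is only subtracted afterwards.
toy_hash(x::UInt64, h::UInt64) = mix64(x) - 3h

x = 0x000000000796a326

# Gap between consecutive seeds; UInt64 arithmetic wraps modulo 2^64.
gaps = [toy_hash(x, UInt64(h)) - toy_hash(x, UInt64(h + 1)) for h in 0:3]
println(gaps)  # every entry is 3, whatever `x` and `mix64` are
```

This matches the low bytes in the output above (0x75, 0x72, 0x6f, 0x6c), which also step down by 3 each time.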

BioTurboNick · June 1, 2022, 9:32pm

On pondering, I’m thinking the intent is that h is supposed to be the output of a previous hash function call? (hence, calling it a hash code)

stevengj · June 1, 2022, 9:36pm

Yes, it’s for mixing multiple hashes together.

julia> hash(0x000000000796a326, hash(0))
0xefde128c35091bc5

julia> hash(0x000000000796a326, hash(1))
0x43ed831bde9d910b
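A rough sketch of what that mixing looks like in practice, under some assumptions: `Point`, `SEEDS`, and `bucket_indices` are made-up names for illustration, and the second half is only one possible way to get several hash functions for something like a Bloom filter, not what Probably.jl actually does. The idea in both halves is that the second argument is itself a hash output rather than a small counter.

```julia
# Hypothetical custom type; the two-argument `hash` chains the running hash `h`
# through each field, so `h` is always the output of an earlier `hash` call.
struct Point
    x::Int
    y::Int
end

Base.:(==)(a::Point, b::Point) = a.x == b.x && a.y == b.y
Base.hash(p::Point, h::UInt) = hash(p.y, hash(p.x, hash(:Point, h)))

# For k "different" hash functions, derive the seeds by hashing the counter
# first instead of passing 1, 2, 3, ... directly as the second argument.
const SEEDS = [hash(i) for i in 1:4]

# Map an item to one 1-based index per seed in a table of `nbits` slots.
bucket_indices(item, nbits::Integer) =
    [Int(hash(item, s) % UInt(nbits)) + 1 for s in SEEDS]

println(bucket_indices("some key", 64))
println(Dict(Point(1, 2) => "found")[Point(1, 2)])
```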

StefanKarpinski · June 2, 2022, 5:49pm

It kind of seems like the - 3h could go inside the call to hash_uint64 instead of outside. You want to make sure the function is asymmetric in its two arguments, but the factor of -3 already ensures that. The only downside I can see is that it could make it easier to craft an input that interacts badly with given hashes. It would be good to look at the history of this definition.
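To illustrate the difference between the two placements, here is a small sketch. It assumes a stand-in mixer: `mix64` (a splitmix64-style finalizer), `toy_outside`, and `toy_inside` are hypothetical names and not Base functions, so the printed values will not match the real `hash`.

```julia
# Hypothetical stand-in for `hash_uint64`; a bijection on UInt64.
function mix64(z::UInt64)
    z = (z ⊻ (z >> 30)) * 0xbf58476d1ce4e5b9
    z = (z ⊻ (z >> 27)) * 0x94d049bb133111eb
    return z ⊻ (z >> 31)
end

toy_outside(x::UInt64, h::UInt64) = mix64(x) - 3h   # shape quoted from hashing.jl
toy_inside(x::UInt64, h::UInt64)  = mix64(x - 3h)   # seed folded in before mixing

x = 0x000000000796a326

# Left column steps down by exactly 3 per seed; right column gets fully remixed.
for h in UInt64(0):UInt64(3)
    println(string(toy_outside(x, h), base = 16, pad = 16), "  ",
            string(toy_inside(x, h), base = 16, pad = 16))
end
```

In this toy version, swapping the arguments still changes the input to the mixer (x - 3h versus h - 3x) unless x and h agree modulo 2^62, so the asymmetry is kept. The trade-off mentioned above also shows up: any (x, h) pairs with the same x - 3h collide by construction.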
