Use of MurmurHash3 for hashing strings

A lot of languages have changed their default hash algorithms to address concerns about hash flooding. As I understand it, the argument is basically “better safe than sorry” when it comes to the default (since a library’s hash table sometimes gets used in unexpectedly sensitive places), coupled with the difficulty of using a non-built-in hash in many high-level languages. Some relevant discussions from other languages:

Julia is in a somewhat different position than several of these languages, however: you can swap your own hash function into Julia’s Dict type without any performance cost. In something like CPython, by contrast, the hash function is embedded in the C implementation, so it’s not possible to use a different hash without sacrificing performance, writing a huge pile of C code, or recompiling Python itself.
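
To make the "swap your own hash function" point concrete, here is a minimal sketch of one way to do it: wrap the key in your own type and add a `Base.hash` method for it. The names `FNVKey` and `fnv1a` are made up for illustration, and FNV-1a is used only because it fits in a few lines. It is not flood-resistant, so treat it as a placeholder for whatever hash you actually want.

```julia
# Hypothetical wrapper type; the name is illustrative only.
struct FNVKey
    s::String
end

# 64-bit FNV-1a over the string's bytes, mixed with a seed.
# Stand-in for "your own hash"; NOT resistant to hash flooding.
function fnv1a(s::String, seed::UInt64)
    h = 0xcbf29ce484222325 ⊻ seed          # FNV offset basis, xor seed
    for b in codeunits(s)
        h = (h ⊻ b) * 0x00000100000001b3   # FNV prime; UInt64 math wraps
    end
    return h
end

# Dict calls hash(key, h::UInt) and isequal(key1, key2) on its keys,
# so these two methods are enough to route lookups through fnv1a.
Base.hash(k::FNVKey, h::UInt) = fnv1a(k.s, UInt64(h)) % UInt
Base.:(==)(a::FNVKey, b::FNVKey) = a.s == b.s

d = Dict{FNVKey,Int}()
d[FNVKey("apple")] = 1
d[FNVKey("pear")] = 2
```

Because `hash` is an ordinary generic function, the method is compiled and specialized like any other Julia code. The cost is that callers have to wrap their keys, e.g. `d[FNVKey("apple")]`.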

Even so, there is still a valid argument about coding defensively when writing library/package code.
