For another project of mine, I wanted a good fast hash that wasn’t trivial to mock. After an excessive amount of investigation, I’ve settled on KangarooTwelve. To quote the project readme, I think it strikes a really nice balance between:
Simplicity (Keccak + sponge + hopping)
Security (128-bit, sharing the same cryptographic base as SHA3)
Speed (up to ~2bytes/cycle, with AVX512)
With a little help from @Lilith, @Sukera, and @Oscar_Smith (big thanks for the help so far ) in #performance-helpdesk we’ve currently got what seems to be the fastest pure-julia cryptographic hash (by some margin).
That said, it seems that we’re quite some way from where we could be. By my estimation, this is currently around ~4 cycles per byte, and according to the keccak team1.4 cycles/byte is achievable with AVX2 and 0.5 cycles/byte with AVX512.
So, that’s a ~3-8x performance improvement to be had, and that’s not even considering threading (yes, this can be threaded).
I’m pretty sure I’m nearing the limit of my high-performance/micro-optimisation abilities though, and I’d love it if I might be able to interest some of the more performance/optimisation-minded folks around here to take a look and see how far we can push this!
This is , but FTR AFAIK (not an expert) K12 may be kind of obsoleted (from performance and flexibility perspectives) by newer designs. One of the most awesome things about sponge-based/permutation-based crypto is how it enables implementing the entirety of symmetric crypto with most of the code shared between the “primitives”, something that K12 is not designed for as far as I remember (it’s just a hash I think, at least “officially”).
For example, Ascon is the winner of the Lightweight Cryptography challenge (by the US NIST):
Furthermore, the Keccak Team themselves (after some iteration, including third-party work) introduced several new concepts that make sponges obsolete AFAIK, such as deck functions and Farfalle. These are not just new algorithms, but seem like they will change the way symmetric crypto is both implemented and used (new modes) for the better: Keccak Team
One interesting thing about implementing crypto primitives is that, AFAIK, safety from some side-channel attacks requires using assembly (or llvmcall, in the case of Julia, I guess). For example, ensuring that some operations are constant-time (to protect against timing attacks) as necessary sometimes requires protection from an overzealous compiler optimizer. I wonder whether there are any existing Julia projects that tackle such issues?
AFAIK the current “fastest” established general-purpose cryptographic hash is Blake3. However, at longer input lengths (what I care abount more), K12 is actually extremely competitive. To take an excerpt from the Blake3 paper:
I have seen however, those are all authenticated encryption schemes, and AFAIK no established desk/farfalle based hash function currently exists.
My understanding is that this is much more relevant when looking at crypto that involves secrets (e.g. encryption). When hashing, an attacker knows everything to start with … which is a large part of why translating a cipher to a hash is non-trivial (the secret is no longer secret).