Nope! C ain’t an insurmountable barrier like the speed of light is. For some algorithms, C can’t necessarily express the fastest possible implementation natively. But, similarly, Julia might not be able to, either.
They’re all just languages that try, with varying levels of success, to let you express (and then compile to) the fastest set of instructions for a given architecture.
I’m still trying to figure out how to express this implementation in a way that’s both more generic and more easily (and perhaps even documentedly?) specializable. It may yet change in bigger ways.