On a machine with AVX2, I can confirm that numpy.sort is faster (but definitely not 10x):
In contrast, numpy.sort seems about 10x slower than Julia for 10^8 floats on my Apple M3 laptop:
This is because I was optimizing sorting while developing on an Apple machine without AVX2. If we had cross-platform benchmark infrastructure in CI, Julia would likely be 10x faster in both cases.