That can be a nice micro optimization.
It looks like you’ve been bitten by the sorting bug. For more on sorting algorithms, see this thread, where I posted four links you might find interesting.
At the bottom of that thread, I posted a couple of links to algorithms that use SIMD and multiple threads. Right now, that would be the ultimate sort for primitive types. The way to get such a sort incorporated into Base would be to release it as a package first. After a while, once it has been debugged and lots of people are finding it useful, there is a chance it could be pulled into Base.
Or … maybe it just stays in a package. Packages work great, although it can be tedious to load two dozen of them and re-add them for each new Julia release. I can’t say I know of a better alternative. PackageCompiler can help somewhat.
It also depends on the philosophy of Julia. I created a post asking about that, but it didn’t get much traction. Do we want everything to eventually be SIMD and multithreaded? Or do people get annoyed when functions launch threads on their own, and prefer them to stay single-threaded unless asked otherwise?
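To make the "single-threaded unless asked" option concrete, here is a rough sketch of what an opt-in interface might look like. The function name `sort_maybe_threaded!` and the `threaded` and `cutoff` keywords are made up for illustration; nothing like this exists in Base, and a serious implementation would recurse, use SIMD, and tune the cutoff.

```julia
# Hypothetical helper, not part of Base or any package.
# Sorts `v` in place. Threads are used only when the caller asks for them.
function sort_maybe_threaded!(v::Vector; threaded::Bool=false, cutoff::Int=10_000)
    # Default path: plain single-threaded sort, no surprises for the caller.
    if !threaded || length(v) < max(cutoff, 2)
        return sort!(v)
    end

    n = length(v)
    mid = n ÷ 2

    # Sort the two halves concurrently. The views cover disjoint ranges,
    # so the two tasks never touch the same elements. With a single Julia
    # thread this still works, it just gives no speedup.
    task = Threads.@spawn sort!(view(v, 1:mid))
    sort!(view(v, mid+1:n))
    wait(task)

    # Merge the sorted halves back into `v`, using a copy of the left half.
    left = v[1:mid]
    i, j, k = 1, mid + 1, 1
    while i <= mid && j <= n
        if isless(v[j], left[i])
            v[k] = v[j]
            j += 1
        else
            v[k] = left[i]
            i += 1
        end
        k += 1
    end
    # Any leftover right-half elements are already in place.
    while i <= mid
        v[k] = left[i]
        i += 1
        k += 1
    end
    return v
end
```

With this shape, `sort_maybe_threaded!(v)` behaves like `sort!(v)`, and only `sort_maybe_threaded!(v; threaded=true)` ever spawns a task. The opposite philosophy would simply flip the default.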