Sort on tuple is > 4x as fast as sortrows on the equivalent array

I was a little surprised at the performance difference here:

julia> shuffle!(a)
9-element Array{Tuple{Int64,Int64},1}:
 (1, 2)
 (2, 20)
 (2, 30)
 (2, 10)
 (3, 39)
 (1, 6)
 (1, 4)
 (3, 31)
 (3, 33)

julia> s = [i[1] for i in a]; d = [i[2] for i in a];

julia> b = hcat(s, d)
9×2 Array{Int64,2}:
 1   2
 2  20
 2  30
 2  10
 3  39
 1   6
 1   4
 3  31
 3  33

julia> @benchmark(sort($a))
BenchmarkTools.Trial:
  memory estimate:  384 bytes
  allocs estimate:  3
  --------------
  minimum time:     231.093 ns (0.00% GC)
  median time:      445.286 ns (0.00% GC)
  mean time:        483.968 ns (3.47% GC)
  maximum time:     18.558 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     290

julia> @benchmark(sortrows($b))
BenchmarkTools.Trial:
  memory estimate:  1.22 KiB
  allocs estimate:  18
  --------------
  minimum time:     1.104 μs (0.00% GC)
  median time:      2.005 μs (0.00% GC)
  mean time:        2.303 μs (4.32% GC)
  maximum time:     226.557 μs (93.68% GC)
  --------------
  samples:          10000
  evals/sample:     10

I think that this is because sortrows is optimized for arrays that are bigger than nx2. For example,

n = 10
m = 10

a = rand(1:10, n, m)
b = [(a[i,:]...) for i in 1:n]

@benchmark sortrows($a)

BenchmarkTools.Trial: 
  memory estimate:  1.92 KiB
  allocs estimate:  19
  --------------
  minimum time:     1.196 μs (0.00% GC)
  median time:      1.340 μs (0.00% GC)
  mean time:        1.557 μs (6.74% GC)
  maximum time:     163.354 μs (93.21% GC)
  --------------
  samples:          10000
  evals/sample:     10

@benchmark sort($b)

BenchmarkTools.Trial: 
  memory estimate:  1.03 KiB
  allocs estimate:  3
  --------------
  minimum time:     467.066 ns (0.00% GC)
  median time:      641.760 ns (0.00% GC)
  mean time:        734.292 ns (9.24% GC)
  maximum time:     18.231 μs (88.09% GC)
  --------------
  samples:          10000
  evals/sample:     198

but if I set m = 1_000, then I get

@benchmark sortrows($a)

BenchmarkTools.Trial: 
  memory estimate:  79.25 KiB
  allocs estimate:  20
  --------------
  minimum time:     11.251 μs (0.00% GC)
  median time:      16.116 μs (0.00% GC)
  mean time:        29.179 μs (22.21% GC)
  maximum time:     4.927 ms (93.81% GC)
  --------------
  samples:          10000
  evals/sample:     1

@benchmark sort($b)

BenchmarkTools.Trial: 
  memory estimate:  78.36 KiB
  allocs estimate:  4
  --------------
  minimum time:     51.590 μs (0.00% GC)
  median time:      58.386 μs (0.00% GC)
  mean time:        73.561 μs (8.27% GC)
  maximum time:     3.661 ms (90.14% GC)
  --------------
  samples:          10000
  evals/sample:     1

So sortrows closes the gap.

Although, a 4x speed up does seem a little big…