Sorting instability issue

Hello everyone, I am working with Julia 1.10.2 and have a question about the sort function in Julia.

Suppose I run the following code:

time_sort = zeros(100, 1);
for i = 1:100
    n = 10^6
    x0 = rand(n);
    time_sort[i] = @elapsed x0sort = sort(x0, rev=true);
end

The result shows:
julia> time_sort
100×1 Matrix{Float64}:
0.101048458
0.120940708
0.108399541
0.097998959
0.090954625
0.096979917
0.1142765
0.099838583
0.097053875
0.092064458
 ⋮
0.009552855
0.009860167
0.009679833
0.009623333
0.009937875
0.009491167
0.0097165
0.009016333
0.009795333
0.009923958

The first 20 elements of time_sort are about 0.1 s, but the final elements are about 0.01 s, a difference of roughly a factor of 10, even though every iteration runs the same code. I am wondering why this happens. Does anyone know?

I can’t reproduce this:

julia> using UnicodePlots

julia> scatterplot([@elapsed sort(rand(10^6), rev=true) for _ ∈ 1:100])
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   0.046 β”‚β‘€β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β  β €β €β €β‘€β’€β ‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β €β €β „β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β €β €β €β β €β €β ˆβ β €β €β €β €β €β €β €β €β €β „β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β”‚
         β”‚β €β‘€β ˆβ €β €β „β €β‘€β ‚β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β  β €β €β”‚
         │⠀⒀⑀⠁⠠⠀⠀⠀⠠⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⒀⠀⠁⠀⠀⠀⠄⑄⒀│
         β”‚β €β €β €β  β €β €β ˆβ €β €β β ”β’‚β „β‘β£„β €β‘”β‘€β£€β ”β‘”β’β €β ”β ’β Šβ‘„β’„β €β „β’„β‘€β Œβ  β •β ‘β Œβ €β β’„β”‚
   0.037 β”‚β €β €β €β €β €β €β €β €β €β ˆβ β €β ‰β €β €β €β €β ˆβ €β €β €β €β €β €β €β €β ˆβ €β β ˆβ €β €β €β €β €β €β €β €β €β €β”‚
         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β €0β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €β €100β €

There’s slightly higher runtime for the first execution and some general noise later on, which is why people generally benchmark with BenchmarkTools or Chairmarks to get more reliable results.
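For instance, a minimal sketch with BenchmarkTools (assuming the package has been added with `] add BenchmarkTools`):

```julia
using BenchmarkTools

x0 = rand(10^6)

# @btime runs the expression many times and reports the minimum time,
# which filters out compilation and other one-off noise. Interpolating
# with $x0 avoids measuring global-variable access overhead.
@btime sort($x0, rev=true);
```

@benchmark instead of @btime reports the full timing distribution if you want more detail.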

Also,

Any specific reason for this? The current release is 1.11.2 and the LTS is 1.10.7, so there isn’t really a reason to be on 1.10.2.

I have no idea. I can’t reproduce the strange result anymore: when I run the code again now, the times are consistently around 0.01 s instead of 0.1 s.

By the way, what is the default sorting algorithm in Julia? I found that it costs about 0.1 s if I use mergesort or quicksort:

sort(x0, rev=true, alg=MergeSort) or sort(x0, rev=true, alg=QuickSort)

However, the default algorithm is much faster than quicksort or mergesort. When I check the Julia documentation, it says: "Currently, a hybrid of RadixSort, ScratchQuickSort, InsertionSort, and CountingSort is used based on input type, size, and composition." So the default algorithm is a hybrid of these four algorithms?
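One quick, illustrative way to compare on the same data (timings vary by machine, and a single @elapsed measurement is noisy):

```julia
x0 = rand(10^6)

# Warm up first so compilation time is not included in the comparison.
sort(x0, rev=true); sort(x0, rev=true, alg=QuickSort);

t_default = @elapsed sort(x0, rev=true)                 # default (radix-based path for Float64)
t_quick   = @elapsed sort(x0, rev=true, alg=QuickSort)  # forced comparison-based sort

println("default: $(t_default) s, QuickSort: $(t_quick) s")
```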

No specific reason for choosing this version. I downloaded Julia last year and chose this version.

Probably not a problem either, but for convenient management of multiple Julia versions, including keeping an up-to-date installation, it’s recommended to use juliaup:
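A typical session might look like this (assuming juliaup is already installed; see its README for installation instructions):

```shell
juliaup add lts        # install the long-term-support channel
juliaup default lts    # make it the default "julia" on the PATH
juliaup update         # update all installed channels
juliaup status         # list installed channels and versions
```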

Thanks for your advice! It may be a problem, since I want to compare my algorithm with the sorting algorithms in Julia. Everything seems fine if I set the sorting algorithm to MergeSort, QuickSort, or another specific algorithm. However, the default algorithm in the sort function is much faster than the others, which confuses me.

Sorry, I meant that using Julia 1.10.2 is likely not a problem, although if you are interested in the very granular details of sort, maybe it is, since the actual implementation of the default sort is an implementation detail.

Have you looked at the extended help as advised in the manual?

help?> ?Base.DEFAULT_STABLE
  β”‚ Warning
  β”‚
  β”‚  The following bindings may be internal; they may change or be removed in future versions:
  β”‚
  β”‚    β€’  Base.DEFAULT_STABLE

  DEFAULT_STABLE

  The default sorting algorithm.

  This algorithm is guaranteed to be stable (i.e. it will not reorder elements that compare equal). It makes an effort to be fast for most inputs.

  The algorithms used by DEFAULT_STABLE are an implementation detail. See extended help for the current dispatch system.

  Extended Help
  ≑≑≑≑≑≑≑≑≑≑≑≑≑

  DEFAULT_STABLE is composed of two parts: the InitialOptimizations and a hybrid of Radix, Insertion, Counting, Quick sorts.

  We begin with MissingOptimization because it has no runtime cost when it is not triggered and can enable other optimizations to be applied later. For example, BoolOptimization cannot apply to an
AbstractVector{Union{Missing, Bool}}, but after MissingOptimization is applied, that input will be converted into an AbstractVector{Bool}.

  We next apply BoolOptimization because it also has no runtime cost when it is not triggered and when it is triggered, it is an incredibly efficient algorithm (sorting Bools is quite easy).

  Next, we dispatch to InsertionSort for inputs with length <= 10. This dispatch occurs before the IEEEFloatOptimization pass because the IEEEFloatOptimizations are not beneficial for very small inputs.

  To conclude the InitialOptimizations, we apply IEEEFloatOptimization.

  After these optimizations, we branch on whether radix sort and related algorithms can be applied to the input vector and ordering. We conduct this branch by testing if UIntMappable(v, order) !== nothing.
  That is, we see if we know of a reversible mapping from eltype(v) to UInt that preserves the ordering order. We perform this check after the initial optimizations because they can change the input vector's
  type and ordering to make them UIntMappable.

  If the input is not UIntMappable, then we perform a presorted check and dispatch to ScratchQuickSort.

  Otherwise, we dispatch to InsertionSort for inputs with length <= 40 and then perform a presorted check (CheckSorted).

  We check for short inputs before performing the presorted check to avoid the overhead of the check for small inputs. Because the alternate dispatch is to InsertionSort which has efficient O(n) runtime on
  presorted inputs, the check is not necessary for small inputs.

  We check if the input is reverse-sorted for long vectors (more than 500 elements) because the check is essentially free unless the input is almost entirely reverse sorted.

  Note that once the input is determined to be UIntMappable, we know the order forms a total order (wikipedia.org/wiki/Total_order) over the inputs and so it is impossible to perform an unstable sort because
  no two elements can compare equal unless they are equal, in which case switching them is undetectable. We utilize this fact to perform a more aggressive reverse sorted check that will reverse the vector [3,
  2, 2, 1].

  After these potential fast-paths are tried and failed, we ComputeExtrema of the input. This computation has a fairly fast O(n) runtime, but we still try to delay it until it is necessary.

  Next, we ConsiderCountingSort. If the range of the input is small compared to its length, we apply CountingSort.

  Next, we ConsiderRadixSort. This is similar to the dispatch to counting sort, but we consider the number of bits in the range, rather than the range itself. Consequently, we apply RadixSort for any
  reasonably long inputs that reach this stage.

  Finally, if the input has length less than 80, we dispatch to InsertionSort and otherwise we dispatch to ScratchQuickSort.
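The stability guarantee from the docstring above can be seen with a small example: elements that compare equal keep their original relative order.

```julia
# Sort pairs by their first element; the two pairs with key 2
# keep their original order ("a" before "c") because the sort is stable.
v = [(2, "a"), (1, "b"), (2, "c")]
println(sort(v, by=first))  # [(1, "b"), (2, "a"), (2, "c")]
```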

For large arrays of Float64 (and many other types), Julia uses a RadixSort.

I see. Thanks for your help! My timing results are consistent with that statement.

Thank you very much! I will carefully check this manual.

If you are interested in sorting things FAST, then I cannot recommend @Lilith’s JuliaCon talks enough, starting with this overview from last year’s JuliaCon:

There were a couple of other, shorter talks by her on sorting as well.


Thank you for providing the video! It is great for learning the sort function in Julia.