Sorting a vector of fixed size

AdamR · November 19, 2021, 9:37am

I need to solve a combinatorics problem that requires, among other things, computing sortperm (a.k.a. argsort) for thousands of millions of vectors of fixed size.

It so happens that the size of the vectors is known at compile time in my setup.
Since the vectors are expected to be small (single-digit length), I expect a significant speedup when the compiler is given the size, because it can theoretically unroll the sort into a hierarchy of “if” statements.

Is this kind of optimization achievable in Julia? If so, how?

Edit: in C++ one can approach the problem like this: “Very fast sorting of fixed length arrays using comparator networks” (Stack Overflow).

DNF · November 19, 2021, 9:59am

One approach is to use SVectors from StaticArrays.jl:

1.7.0-rc1> using BenchmarkTools, StaticArrays

1.7.0-rc1> @benchmark sort(v) setup=(v=rand(8))  # ordinary vectors
BenchmarkTools.Trial: 10000 samples with 988 evaluations.
 Range (min … max):  47.773 ns … 675.911 ns  ┊ GC (min … max): 0.00% … 90.25%
 Time  (median):     55.466 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   59.278 ns ±  22.770 ns  ┊ GC (mean ± σ):  1.54% ±  4.36%

   ▃▅▆▇██▇▆▃▁                                                  ▂
  ▇██████████▇▆▅▆▇▆▅▆▄▆▅▅▂▃▅▃▅▆▆█▇▇▇▆▅▆▆▆▅▆▅▆▆▇▇▇██▇▇▅▆▆▅▆▆▅▅▅ █
  47.8 ns       Histogram: log(frequency) by time       125 ns <

 Memory estimate: 128 bytes, allocs estimate: 1.

1.7.0-rc1> @benchmark sort(v) setup=(v=@SVector rand(8))  # static vectors
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  21.586 ns … 256.124 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     23.293 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.495 ns ±   6.133 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇▃▇█▂▁▂▁▁▂▂▁▁▄▁▂                                             ▂
  ████████████████▄▅▅▅▅▅▅▅▅▆▆▅▄▅▆▆▆▅▇▆▆▆▆▇▆▆▆▅▆▅▆▅▆▆▅▅▄▃▃▅▄▄▁▅ █
  21.6 ns       Histogram: log(frequency) by time      55.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

It’s pretty fast (roughly 2.4× on the median here), but the speedup is probably smaller than what you will see from other operations on SVectors.

SVectors are statically sized as well as immutable. If you absolutely have to mutate the vectors, there is also MVector, but in most cases immutability is not a problem.

DNF · November 19, 2021, 10:03am

(Edit: The benchmarks below are not quite reliable, since they probably measure sorting vectors that were already sorted by an earlier evaluation. I’m not sure how to benchmark in-place sorting of short vectors, since using evals=1 does not work well either. Benchmarking sort (not sort!) is probably more reliable.)

Hmm, I tried MVector as well as sorting in-place:

1.7.0-rc1> @benchmark sort!(v) setup=(v=@MVector rand(8))
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  18.273 ns … 89.157 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     18.976 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   19.758 ns ±  3.703 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃█▇▆▃▃▁▂      ▁     ▂ ▂                                     ▂
  ██████████▇████▄█▇█▇███▇▄▅▄▄▃▄▁▁▁▃▄▄▃▁▄▁▁▃▁▁▄▄▆▅▅▅▄▄▅▅▅▄▄▅▄ █
  18.3 ns      Histogram: log(frequency) by time        37 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

1.7.0-rc1> @benchmark sort!(v) setup=(v=rand(8))
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  19.679 ns … 130.924 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     21.084 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   22.022 ns ±   5.150 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄▇█▆▄▂▁▃▁▁   ▂▃                                              ▂
  ███████████▇███▇▅▄▅▅▅▄▄▁▃▁▁▁▁▅▇▇▇▅▅▅▅▅▅▅▅▆▆▇▅▆▆▆▆▅▄▄▅▄▄▃▃▅▄▄ █
  19.7 ns       Histogram: log(frequency) by time      49.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Sorting in-place with sort! is actually faster than sorting SVectors, which surprises me a bit. Keep in mind, though, that SVectors have many other performance benefits that you should consider.
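One possible way around the already-sorted-input problem mentioned in the edit above is to time a whole batch of distinct vectors, so that each sort! call sees fresh, unsorted data exactly once. This is only a rough sketch using Base: the helper name ns_per_sort! is made up for illustration, and the result also includes loop overhead and memory traffic, so treat it as an upper bound rather than a precise measurement.

```julia
# Hypothetical helper (not part of BenchmarkTools): average time per in-place
# sort, where every vector in the batch is sorted exactly once.
function ns_per_sort!(vs)
    t0 = time_ns()
    for v in vs
        sort!(v)
    end
    return (time_ns() - t0) / length(vs)
end

ns_per_sort!([rand(8) for _ in 1:1_000])   # warm-up call so compilation is not timed

vs = [rand(8) for _ in 1:100_000]          # fresh, unsorted inputs
t = ns_per_sort!(vs)
println("about ", round(t, digits=1), " ns per sort! call (including loop overhead)")
```

The same idea should carry over to a batch of MVectors, as long as each one is sorted only once.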

goerch · November 19, 2021, 10:08am

You could try to use https://github.com/JeffreySarnoff/SortingNetworks.jl or https://github.com/nlw0/ChipSort.jl
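To give a rough idea of what a sorting network looks like in Julia, here is a minimal sketch for tuples of length 4. It is not the API of either package above; the names compare_exchange and sort4 are made up for illustration. The comparator sequence is the standard 5-comparator network for 4 inputs.

```julia
# Return the pair (a, b) in sorted order; isless gives a total order,
# so NaN and signed zeros are handled the same way as in Base.sort.
@inline compare_exchange(a, b) = isless(b, a) ? (b, a) : (a, b)

# Hypothetical fixed-size sort: 5 compare-exchange steps, no loops.
function sort4(v::NTuple{4,T}) where {T}
    a, b, c, d = v
    a, b = compare_exchange(a, b)
    c, d = compare_exchange(c, d)
    a, c = compare_exchange(a, c)
    b, d = compare_exchange(b, d)
    b, c = compare_exchange(b, c)
    return (a, b, c, d)
end

sort4((4, 2, 3, 1))   # returns (1, 2, 3, 4)
```

Because the sequence of comparisons is fixed and independent of the data, the compiler sees straight-line code for each size.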

AdamR · November 19, 2021, 10:15am

These are great libraries, but neither provides sortperm/argsort, just sort.
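For reference, a fixed-size argsort can be sketched in plain Julia by counting ranks instead of using a sorting network. This is an illustration only, with made-up names (rank_of, index_with_rank, sortperm_fixed), and I have not benchmarked it. It takes an NTuple so that the length N is part of the type, which may let the compiler unroll the small loops. Ties are broken by index, so the result should match the stable Base.sortperm.

```julia
# Rank of element i: 1 + number of elements that precede it in a stable sort.
@inline function rank_of(v::NTuple{N,T}, i::Int) where {N,T}
    r = 1
    for j in 1:N
        r += isless(v[j], v[i]) | (isequal(v[j], v[i]) & (j < i))
    end
    return r
end

# Position i whose rank equals k (the ranks form a permutation of 1:N).
@inline function index_with_rank(ranks::NTuple{N,Int}, k::Int) where {N}
    idx = 0
    for i in 1:N
        idx = ifelse(ranks[i] == k, i, idx)
    end
    return idx
end

# Hypothetical fixed-size sortperm: O(N^2) comparisons, intended for small N only.
function sortperm_fixed(v::NTuple{N,T}) where {N,T}
    ranks = ntuple(i -> rank_of(v, i), Val(N))
    return ntuple(k -> index_with_rank(ranks, k), Val(N))
end

sortperm_fixed((3.0, 1.0, 2.0))   # returns (2, 3, 1)
```

The quadratic number of comparisons is usually acceptable for single-digit N, but a sorting network that carries (value, index) pairs would need fewer comparisons.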

goerch · November 19, 2021, 10:25am

There is also https://github.com/xiaodaigh/SortingLab.jl, although they don’t seem to use sorting networks…
