Sortperm slow test case and fix suggestion

is this a bug?

a = vcat(zeros(Float32,1_000_000),ones(Float32,1_000_000));

@time sortperm(a,alg=QuickSort);
0.049065 seconds (8 allocations: 304 bytes)

a = vcat(ones(Float32,1_000_000),zeros(Float32,1_000_000));
@time sortperm(a,alg=QuickSort);
133.768817 seconds (8 allocations: 304 bytes)
function lt(p::Perm, a::Integer, b::Integer)
    da = p.data[a]
    db = p.data[b]
+     lt(p.order, da, db) # | (!lt(p.order, db, da) & (a < b))
-     lt(p.order, da, db) | (!lt(p.order, db, da) & (a < b))
end

comment out the rest of the line and it goes super fast

Welcome to the forums! It’s a bit tough to read the code in your post, please quote your code with 3 backticks at the top and bottom of code blocks or press the button that looks like </>. Compare

function foo(speech)
println(“this doesn’t look good”)
end

to

function bar(code)
    println("ahh, much better")
end

See also: PSA: make it easier to help you

3 Likes

Thanks for editing! Out of curiosity, have you run tests with that change to see what breaks? I don’t know what that line does, but I’d be surprised to learn it serves no purpose.

two types of tests fail (1) those that test for stability in stable algos (merge,…) and (2) some that seem to want stability from quicksort, i.e.

alg = QuickSort
p = sortperm(v, alg=alg, rev=rev)
@test p == sortperm(float(v), alg=alg, rev=rev)

let’s back up. my “code patch” above is trying to draw attention to where I think a problem may be, which is the combination of { sortperm, quicksort, certain data with highly repeated values }. I’m not smart enough to figure out why changing the lt(p::Perm…) makes it faster, or all of the ways Perm is used in other algos, but I do know my application is spending a lot of time sortperming

If data is extremely repeated, you can use PooledArrays to optimize storage and performance. For example:

julia> using PooledArrays

julia> a = vcat(ones(Float32,1_000_000),zeros(Float32,1_000_000));

julia> @time v = PooledArray(a);
  0.020283 seconds (16 allocations: 1.908 MiB)

julia> @time sortperm(v);
  0.014729 seconds (17 allocations: 15.260 MiB)

In general I think vectors of integers are faster to sort than vector of floats, so if you can switch to using integers in your application you could have some performance gain.

1 Like

I think that you are running into the general problem that quicksort has quadratic worst case, unless one does a lot of complicated pivot selection. Julia base does not do fancy pivot selection. Quicksort is good on random data, though.

Just use mergesort, or use some package that implements timsort (since you hit a worse than log-linear complexity in your application, it stands to reason that your data is already partially sorted in some way; timsort is really good at using that!).

PS. Demo:

julia> using BenchmarkTools

julia> n=10_000; b=vcat(ones(Float32,n),zeros(Float32,n));

julia> @btime sortperm(b; alg=QuickSort);
  16.900 ms (5 allocations: 156.38 KiB)
julia> @btime sortperm(b; alg=MergeSort);
  396.191 μs (8 allocations: 234.72 KiB)
julia> sp=sortperm(b);
julia> @btime sort(sp; alg=QuickSort);
  7.973 ms (2 allocations: 156.33 KiB)
julia> @btime sort(sp; alg=MergeSort);
  256.369 μs (4 allocations: 234.59 KiB)

julia> n=100_000; b=vcat(ones(Float32,n),zeros(Float32,n));

julia> @btime sortperm(b; alg=QuickSort);
  1.722 s (5 allocations: 1.53 MiB)
julia> @btime sortperm(b; alg=MergeSort);
  5.295 ms (8 allocations: 2.29 MiB)
julia> sp=sortperm(b);
julia> @btime sort(sp; alg=QuickSort);
  874.830 ms (2 allocations: 1.53 MiB)
julia> @btime sort(sp; alg=MergeSort);
  3.602 ms (4 allocations: 2.29 MiB)
2 Likes

quicksort is fine on this data, sort() goes super fast, so it’s not the worst case business. sortperm() has different behavior than sort(), I think because it uses the Perm lessthan function that tries to do stable stuff. does this make sense?

n=1_000_000; b=vcat(ones(Float32,n),zeros(Float32,n));
@time sort(b,alg=QuickSort);
  0.288000 seconds (109.52 k allocations: 13.569 MiB, 29.54% gc time)
@time sort(b,alg=MergeSort);
  0.088643 seconds (11.26 k allocations: 12.060 MiB, 64.68% gc time)
@time sortperm(b,alg=MergeSort);
  0.244369 seconds (106.38 k allocations: 28.538 MiB)
@time sortperm(b,alg=QuickSort);
131.923515 seconds (15.07 k allocations: 16.057 MiB, 0.04% gc time)

It is the worst case business, as I demonstrated by sorting the sortperm.

The reason quicksort is fast on the original data is that it is unstable and can use the instability to avoid the worst case. If you sortperm, then you must use a stable sort, either by algorithm choice or by breaking ties (which is the line you commented out). If you break ties then the resulting array is close to the worst case.

Or would you be happy with non-deterministic results like sortperm([1,1,1,1]) == [3, 1, 4, 2]?

edit: Sorry, I failed at reading comprehension, you already wrote everything I had to say.

Yes, exactly. sort() gets away with using an unstable algorithm on floats and integers because it assumes that isequal float/ints are indistinguishable. They are not, and sorting a vector of BigInt can probably be unstable with respect to ===, but no with respect to ==.

I would be happy with this because this is what the documentation says: " If you choose a non-stable sorting algorithm such as QuickSort , a different permutation that puts the array into order may be returned. " This is also the behavior of np.argsort.

2 Likes

Fair enough, this is a valid point. I still think this would be a big footgun, but you are right that unstable sortperm is currently not offered and could be useful for applications like yours.

Strange… Where did you find this documentation? For Julia 1.1, the doc says that “the permutation is guaranteed to be stable even if the sorting algorithm is unstable, meaning that indices of equal elements appear in ascending order.”

https://docs.julialang.org/en/v1.1/base/sort/#Base.sortperm

Are you using another version? Or another source of documentation?

Ah, interesting. I was using 0.6.4, so my bad. You are right there is a footgun element which maybe why they went in this direction in 0.7+. But man, quadratic performance is a killer. I will continue using my hacked quicksort, and I guess other people can do the same if they need it. Thanks!

Really consider mergesort. Quicksort is generally very dangerous for anything that is not guaranteed random, for exactly this reason, and I do my best to avoid using it for anything since getting hit by this once (big footgun in base). Quicksort is not that much quicker than mergesort for random inputs (maybe 10%-30%).

In principle I’d like to recommend timsort for everything, but the SortingAlgorithms.jl implementation is suprisingly slow on unstructured / random data, compared to base / mergesort (almost 2x time plus lots of garbage collection). I cannot say whether that is inherent in the timsort algorithm or due to implementation problems or due to inappropriate tuning (afaik the the python tuning parameters are used, which may be inappropriate for julia).

1 Like