# How to sort a vector by an expensive function in place efficiently?

The normal `sort!(...;by=f)` sorts in place by a function, but it results in calling that function many more times than the length of the vector. I tried all the sorting algorithms included in Base and all of them resulted in calling the `by` function multiple times the length of the vector. So, this doesn’t work well if the `by` function is expensive.

I have seen the recommendation to use sortperm for this: Sorting and Related Functions · The Julia Language
However, it appears that it is creating 2 new vectors as large as the original. Is there a way to do this more efficiently when one wants to sort the original vector in place?

Sorting in place is described by the `REPL` help with `?sort!`

by

``````  sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Sort the vector v in place. QuickSort is used by default for numeric arrays while MergeSort is used for other arrays. You can specify an algorithm to use via the alg keyword (see Sorting Algorithms
for available algorithms). The by keyword lets you provide a function that will be applied to each element before comparison; the lt keyword allows providing a custom "less than" function (note
that for every x and y, only one of lt(x,y) and lt(y,x) can return true); use rev=true to reverse the sorting order. These options are independent and can be used together in all possible
combinations: if both by and lt are specified, the lt function is applied to the result of the by function; rev=true reverses whatever ordering specified via the by and lt keywords.
)
``````

So I’m not really understanding what you are after?

I added more details to the question explaining why `sort!` is inappropriate for the case of expensive functions. It calls the function many times more than once per item.

1 Like

If that is the case then you could easily get an improvement by just calling the function itself inplace and then sorting in place.

1 Like

Another possibility is to memoize the function.

You need to store the result of calling `f` on the data somewhere. For example, instead of `sort(x, by=f)`, you could use:

``````x = x[sortperm(f.(x))]
``````

which has the downside that it allocates additional arrays for `f.(x)`, the permutation, and for re-ordering `x`.

To operate more in-place, you could use the StructArrays.jl package to implicitly create an array of tuples of `(f(x[i]), x[i])` values:

``````fx = StructArray((f.(x), x))
sort!(fx, by=first) # also sorts x
``````

For example:

``````x = rand(-9:9, 10)
sort!(StructArray((abs.(x), x)), by=first)
``````

sorts `x` by absolute value, but only calls `abs` once on each element of `x`, at the expense of allocating only one additional array `abs.(x)`.

PS. You could also use something like Memoize.jl to memoize `f`, but this builds a dictionary structure that is probably not as efficient as the code above if `x` contains mostly unique values. On the other hand, memoization will be more efficient than the `f.(x)` above if `x` contains only a few distinct values repeated many times, since then you will only call `f` once per value in `x`.

6 Likes

Thanks for the pointer to StructArray. I had been wondering if there was a way to do a “view” sort of thing like that. I had thought about memoization, but every value is different so it wouldn’t work well for my case.

The timing is rather curious on the three approaches:

``````using StructArray
function texp(x)
for _ in 1:1000
x = log(x^2)
end
x
end

testit1(f, v) = sort!(v, by=f)
testit2(f, v) = v[sortperm(f.(v))]
testit3(f, v) = sort!(StructArray((f.(v), v)), by=first)

v = rand(10000)

julia> @btime testit1(\$texp, \$v);
4.464 s (0 allocations: 0 bytes)

julia> @btime testit2(\$texp, \$v);
220.449 ms (9 allocations: 234.56 KiB)

julia> @btime testit3(\$texp, \$v);
220.919 ms (29 allocations: 157.62 KiB)

v = rand(100000)
julia> @btime testit2(\$texp, \$v);
2.226 s (9 allocations: 2.29 MiB)

julia> @btime testit3(\$texp, \$v);
2.223 s (29 allocations: 1.53 MiB)
``````

The naive sort! is really bad. The timing of the sortperm and StructArray approaches is about the same, but one allocates more bytes, but one allocates more times. And that pattern stays the same when increasing the size of the vector.

I tried using RadixSort for the same code and it resulted in about the same timing but more allocations (both count and size).

1 Like

a further variant, with the same structure as the solution proposed by @stevengj (but without dependencies) , and with the same performance as testt2

``````testit4(f, v) = sort!(tuple.(f.(v), v), by=first)
``````

This doesn’t sort the original array `v`, but only sorts the new array of tuples. In contrast, `sort!(StructArray((f.(v), v)), by=first)` also sorts `v`.

2 Likes

Other than the `StructArray` trick, you could also try `permute!` to reduce allocations (or even `Base.permute!!`, as you don’t need to preserve the result of `sortperm`).

For example

``````sort_by!(f, v) = Base.permute!!(v, sortperm(f.(v)))
``````

should reduce allocations. I’m not sure if it is slower or faster than simply `v[sortperm(f.(v))]` though.

3 Likes

I was looking for something like permute!. Thanks! So, here are the timings after adding two more variants for 100000 entries:

``````testit4(f, v) = Base.permute!!(v, sortperm(f.(v)))
testit5(f, v) = Base.permute!(v, sortperm(f.(v)))

julia> @btime testit2(\$texp, \$v);
1.157 s (9 allocations: 2.29 MiB)

julia> @btime testit3(\$texp, \$v);
1.156 s (29 allocations: 1.53 MiB)

julia> @btime testit4(\$texp, \$v);
1.149 s (7 allocations: 1.53 MiB)

julia> @btime testit5(\$texp, \$v);
1.150 s (9 allocations: 2.29 MiB)
``````

Fun… The permute!! version has similar allocation count to the basic sortperm approach, but has similar allocation bytes to the StructArray approach. It has the lower of both, which would seem to suggest it is the best approach so far.

Be very careful using `@btime` to benchmark mutating sorting functions like this. The benchmarking code will run the same function multiple times on your inputs, which can significantly bias the results for sorting algorithms because every call after the first will be attempting to sort an already sorted array, which is very cheap. This is mentioned in the manual here: Manual · BenchmarkTools.jl

To benchmark something like `sort!`, you need to do:

``````julia> @btime sort!(x) setup=(x = rand(100)) evals=1
1.668 μs (0 allocations: 0 bytes)
``````

Where the `setup` phase generates new (unsorted) data for each run, and `evals=1` tells the benchmark to only run `sort!` once on that data.

4 Likes

And maybe seed the RNG

``````@btime sort!(x) setup=(Random.seed!(42); x = rand(100)) evals=1
``````
1 Like

Yes, but the problem was that `sort` calls the `by` function multiple times per value, and memoization can help you with that.

``````using Memoize

@memoize function texpm(x)
for _ in 1:1000
x = log(x^2)
end
return x
end

1.7.2> v = rand(10000);

1.7.2> @time testit2(texp, v);
0.206719 seconds (9 allocations: 234.562 KiB)

1.7.2> @time sort!(v, by=texpm);
0.219627 seconds (315.41 k allocations: 4.813 MiB)
``````

Some care must be taken when benchmarking with memoization. Next time you work on the same data, you get

``````1.7.2> @time sort!(v; by=texpm);
0.009580 seconds (198.99 k allocations: 3.036 MiB)
``````

because the values are memoized, and the vector is sorted.

It’s unlikely to ever be faster than the other methods though, unless you are repeatedly working on shuffled versions of the same data.

1 Like

That’s good to know about @btime, thanks!

I had thought about it being presorted. However, I’m not trying to benchmark the algorithm itself, but rather the process around it. Also, I thought there might be the possibility of skewing the results if one case got a particular “easy” set of data. But, I suppose it runs enough samples that that is probably not a concern.

``````julia> @btime testit2(\$texp, x) setup=(x=rand(100000)) evals=1;