The normal sort!(...;by=f) sorts in place by a function, but it results in calling that function many more times than the length of the vector. I tried all the sorting algorithms included in Base and all of them resulted in calling the by function multiple times the length of the vector. So, this doesn’t work well if the by function is expensive.

I have seen the recommendation to use sortperm for this: Sorting and Related Functions · The Julia Language
However, it appears that it is creating 2 new vectors as large as the original. Is there a way to do this more efficiently when one wants to sort the original vector in place?

Sorting in place is described by the REPL help with ?sort!

by

sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)
Sort the vector v in place. QuickSort is used by default for numeric arrays while MergeSort is used for other arrays. You can specify an algorithm to use via the alg keyword (see Sorting Algorithms
for available algorithms). The by keyword lets you provide a function that will be applied to each element before comparison; the lt keyword allows providing a custom "less than" function (note
that for every x and y, only one of lt(x,y) and lt(y,x) can return true); use rev=true to reverse the sorting order. These options are independent and can be used together in all possible
combinations: if both by and lt are specified, the lt function is applied to the result of the by function; rev=true reverses whatever ordering specified via the by and lt keywords.
)

So I’m not really understanding what you are after?

I added more details to the question explaining why sort! is inappropriate for the case of expensive functions. It calls the function many times more than once per item.

You need to store the result of calling f on the data somewhere. For example, instead of sort(x, by=f), you could use:

x = x[sortperm(f.(x))]

which has the downside that it allocates additional arrays for f.(x), the permutation, and for re-ordering x.

To operate more in-place, you could use the StructArrays.jl package to implicitly create an array of tuples of (f(x[i]), x[i]) values:

fx = StructArray((f.(x), x))
sort!(fx, by=first) # also sorts x

For example:

x = rand(-9:9, 10)
sort!(StructArray((abs.(x), x)), by=first)

sorts x by absolute value, but only calls abs once on each element of x, at the expense of allocating only one additional array abs.(x).

PS. You could also use something like Memoize.jl to memoize f, but this builds a dictionary structure that is probably not as efficient as the code above if x contains mostly unique values. On the other hand, memoization will be more efficient than the f.(x) above if x contains only a few distinct values repeated many times, since then you will only call f once per value in x.

Thanks for the pointer to StructArray. I had been wondering if there was a way to do a “view” sort of thing like that. I had thought about memoization, but every value is different so it wouldn’t work well for my case.

The timing is rather curious on the three approaches:

using StructArray
function texp(x)
for _ in 1:1000
x = log(x^2)
end
x
end
testit1(f, v) = sort!(v, by=f)
testit2(f, v) = v[sortperm(f.(v))]
testit3(f, v) = sort!(StructArray((f.(v), v)), by=first)
v = rand(10000)
julia> @btime testit1($texp, $v);
4.464 s (0 allocations: 0 bytes)
julia> @btime testit2($texp, $v);
220.449 ms (9 allocations: 234.56 KiB)
julia> @btime testit3($texp, $v);
220.919 ms (29 allocations: 157.62 KiB)
v = rand(100000)
julia> @btime testit2($texp, $v);
2.226 s (9 allocations: 2.29 MiB)
julia> @btime testit3($texp, $v);
2.223 s (29 allocations: 1.53 MiB)

The naive sort! is really bad. The timing of the sortperm and StructArray approaches is about the same, but one allocates more bytes, but one allocates more times. And that pattern stays the same when increasing the size of the vector.

I tried using RadixSort for the same code and it resulted in about the same timing but more allocations (both count and size).

Other than the StructArray trick, you could also try permute! to reduce allocations (or even Base.permute!!, as you don’t need to preserve the result of sortperm).

Fun… The permute!! version has similar allocation count to the basic sortperm approach, but has similar allocation bytes to the StructArray approach. It has the lower of both, which would seem to suggest it is the best approach so far.

Be very careful using @btime to benchmark mutating sorting functions like this. The benchmarking code will run the same function multiple times on your inputs, which can significantly bias the results for sorting algorithms because every call after the first will be attempting to sort an already sorted array, which is very cheap. This is mentioned in the manual here: Manual · BenchmarkTools.jl

To benchmark something like sort!, you need to do:

Yes, but the problem was that sort calls the by function multiple times per value, and memoization can help you with that.

using Memoize
@memoize function texpm(x)
for _ in 1:1000
x = log(x^2)
end
return x
end
1.7.2> v = rand(10000);
1.7.2> @time testit2(texp, v);
0.206719 seconds (9 allocations: 234.562 KiB)
1.7.2> @time sort!(v, by=texpm);
0.219627 seconds (315.41 k allocations: 4.813 MiB)

Some care must be taken when benchmarking with memoization. Next time you work on the same data, you get

I had thought about it being presorted. However, I’m not trying to benchmark the algorithm itself, but rather the process around it. Also, I thought there might be the possibility of skewing the results if one case got a particular “easy” set of data. But, I suppose it runs enough samples that that is probably not a concern.