I am implementing a routine to find ‘a’ median
of an integer array. The algorithm is based on quicksort, and both implementations below partially search/sort about half the array. They both have the same O(n) average complexity but why is the second one so much slower than the first (despite taking the same number of iterations on an average) ? Should I understand that swapping elements in an array is slower than creating new arrays? If so, why? Especially, note that the slower one uses zero memory!
function quickmedian1(a, m=(length(a)+1)÷2, indices=(1:length(a)), left=0)
pivot = a[indices[1]]
indices₋ = [j for j in indices if a[j] < pivot]
indices₌ = [j for j in indices if a[j] == pivot]
if left + length(indices₋) + length(indices₌) < m
indices₊ = [j for j in indices if a[j] > pivot]
return quickmedian1(a, m, indices₊, left+length(indices₋)+length(indices₌))
elseif left + length(indices₋) < m
return pivot
else
return quickmedian1(a, m, indices₋, left)
end
end
function quickmedian2!(a::AbstractArray{I}, m::Int=(length(a)+1)÷2, x::Int=1, y::Int=length(a)) where I
i = x-1
for j in x:y
if (a[j] < a[y]) || (j==y)
i = i+1
a[j], a[i] = a[i], a[j]
end
end
if i > m
return quickmedian2!(a, m, x, i-1)
elseif i < m
return quickmedian2!(a, m, i+1, y)
else
return a[i]
end
end
using Statistics
using BenchmarkTools
K, L = 8, 6
a = @benchmark median($(rand(0:10^K, 10^L)))
println("\n#################\nmedian")
display(a)
a = @benchmark quickmedian1($(rand(0:10^K, 10^L)))
println("\n#################\nquickmedian1")
display(a)
a = @benchmark quickmedian2!($(rand(0:10^K, 10^L)))
println("\n#################\nquickmedian2!")
display(a)
Output:
#################
median
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 10.501 ms (0.00% GC)
median time: 10.915 ms (0.00% GC)
mean time: 11.576 ms (0.87% GC)
maximum time: 16.586 ms (1.16% GC)
--------------
samples: 431
evals/sample: 1
#################
quickmedian1
BenchmarkTools.Trial:
memory estimate: 44.78 MiB
allocs estimate: 584
--------------
minimum time: 53.406 ms (0.34% GC)
median time: 62.003 ms (0.36% GC)
mean time: 63.645 ms (3.40% GC)
maximum time: 119.835 ms (49.61% GC)
--------------
samples: 79
evals/sample: 1
#################
quickmedian2!
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 13.134 s (0.00% GC)
median time: 13.134 s (0.00% GC)
mean time: 13.134 s (0.00% GC)
maximum time: 13.134 s (0.00% GC)
--------------
samples: 1
evals/sample: 1
PS: Is there an easy way to show benchmark outputs? Also when benchmark runs multiple times on the mutating function call it is not using new input each time, how to remedy that?