So again, I love this solution. It is extremely readable. But I need to create a tuple for extrema which seems like it should require an allocation. For example, this is mathematically equivalent
but requires an allocation and so is 10x slower. I tried median(SArray(v[1],v[1],v[2],v[3],v[4])) and median((v[1],v[2],median((v[1],v[3],v[4])))) but they all allocate.
Is this something special about extrema or something general about Julia I should understand? Or maybe @benchmarktools is lying to me?
If (a-b)*(b-c) >= 0, then both parentheses are >=0, or they are both <=0. Depending on how sensitive the application is to the difference between <= and < (I still don’t quite understand what we are doing here, but it’s some kind of sorting/clamping thing so it shouldn’t matter), you can use xor((a>=b), (b<c)). Or some other combination of comparisons. Shouldn’t matter much for the timings.
We’re ensuring that a number is within the extrema of three other numbers. The clamp function version is certainly the most direct and since it doesn’t use any custom code, it could be used to unit test any of the other versions.
I’m happy to do so but (dumb question) how do I get your PR above? Do I need to git fetch https://github.com/JuliaCI/BenchmarkTools.jl/pull/194/commits/230663872002eae1f9513590f68322636f2fa05d
I’m not sure if there’s a native way to select a specific commit, but ]dev https://github.com/gustaphe/BenchmarkTools.jl/ should do the job for now because it’s on my master branch.
Perhaps a bit of a tangent, but here’s an attempt at at a non-allocating median for tuples, by simply listing all the options. This will end up making the same comparison many times, and only works for short tuples, surely there is a better algorithm:
using Combinatorics, Statistics
function _medex(n)
out = quote end
for p in permutations(1:n)
z = zip(Iterators.repeated(:(<=)), [:(x[$i]) for i in p])
c = Expr(:comparison, Iterators.drop(Iterators.flatten(z),1)...)
v = isodd(n) ? :(x[$(p[1+n÷2])]) : :((x[$(p[n÷2])] + x[$(p[1+n÷2])])/2)
push!(out.args, :($c && return $v))
end
out
end
_medex(3) # to see what this generates
@generated unrollmedian(x::Tuple) = _medex(length(x.parameters))
unrollmedian(xs::Number...) = unrollmedian(xs)
for n in 2:6
x = Tuple(rand(Int8,n))
@show n
@btime median($(Ref(x))[])
@btime unrollmedian($(Ref(x))[]) # at n=7 I have to kill julia
end
using Random, Statistics, BenchmarkTools
Statistics.median(xs::Number...) = median(xs) # to match @gustaphe's functions
Random.seed!(1);
xs = Tuple(rand(Int8, 3)) # (64, 82, -103)
for f in [median, unrollmedian, fastmath_median, fun, mulmedian, compmedian, newcompmedian]
@show f f(xs...)
@btime $f($(Ref(xs))[]...)
end
This comparison prints the following, note that several functions above give the wrong answer, I think they assume that all numbers are positive. Not sure how meaningful these tiny times are:
That works! On my machine all of the double median version are essentially the same. Clamp is a little slower (actually, still pretty much the same) and the single vectorized median is much worse due to the allocation.
@mcabbot, in case it might interest, for larger lists of numbers, a median finding algorithm, seemingly faster than sort-based methods, is presented in a simple way here and the author points also to a more advanced reference on this topic by Andrei Alexandrescu.