I have an “it works … why?” question. I hope this is the right place.
I’m trying to learn to write julia code to have as good performance as possible (yes, I’m new with julia). Most of the time, I’m trying to reproduce the performance of the julia libraries (not always successfully, of course). However, in my experiments I managed to find an algorithm for finding median, which outperforms the built-in median function. According to benchmarks, my function outperforms the built-in one 1:10.
Here is the code
using BenchmarkTools function medianof4(a::Number, b::Number, c::Number, d::Number) if a > c a, c = c, a end if b > d b, d = d, b end if a > b if c > d return a, d else return a, c end else if c > d return b, d else return b, c end end end function medianof3(a::Number, b::Number, c::Number) if a > c a, c = c, a end if a > b if b > c return b elseif a > c return c else return a end else if b < c return b elseif a > c return a else return c end end end """ linmedian(data) Linear algorithm for finding median of a vector or some part of it. """ function linmedian(data; l=1, h=length(data), mode="middle") ml, odd = divrem(h, 2) mh = ml + 1 tmpl, tmph = data[ml], data[mh] if odd == 0 while l < ml tmpl, tmph = medianof4(data[l], tmpl, tmph, data[h]) l += 1 h -= 1 end if mode == "higher" || mode == "high" || mode == "hi" || mode == "h" return tmph elseif mode == "lower" || mode == "low" || mode == "lo" || mode == "l" return tmpl else return (tmpl + tmph) / 2 end else while l < ml tmph = medianof3(data[l], tmph, data[h]) l += 1 h -= 1 end return tmph end end d = rand(0:99, 12000) @benchmark linmedian($d) @benchmark median($d)
OK, I did optimize everything I knew how, and I’m not testing for consistency of input and all that, but still, I’m surprised. Am I doing something wrong (I double-checked the results, output is correct)? Did I overlook something? Did I, per chance, find a faster algorithm for median?
I would really like to hear your opinion.