Ifelse vs ternary with tuples

Note that the compiler tends to prefer using branches, under the assumption that well predicted code.

One example where this backfires is in binary searches, which are generally impossible to predict.
I get

julia> using Random, BenchmarkTools

julia> x = rand(Int, 2048); s = sort(x);

julia> perm = randperm(length(x));

julia> function findbench(f, x, vals)
           @inbounds for i = eachindex(x, vals)
               v = vals[i]
               f(x[v], x) == v || throw("oops")
           end
       end
findbench (generic function with 1 method)

julia> @benchmark findbench((x,v)->searchsortedfirst(v, x), $s, $perm)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  94.526 μs … 117.188 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     95.964 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   96.209 μs ±   1.087 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

            ▁▂▄▄▆▇█▇▆▆▄▂                                        
  ▂▂▂▂▂▃▄▄▆▇█████████████▆▅▄▄▃▃▂▂▂▂▂▂▂▁▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▂▂ ▄
  94.5 μs         Histogram: frequency by time         99.6 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

However, if I start Julia via JULIA_LLVM_ARGS="-x86-cmov-converter=false" julia, which disables the cmov->branch conversion (so our code is actually branchless), I get

julia> @benchmark findbench((x,v)->searchsortedfirst(v, x), $s, $perm)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  40.656 μs … 62.645 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     40.919 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   41.682 μs ±  2.223 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄█▆                 ▁▁▁               ▂▂▁          ▁▂▁      ▁
  ███▇▃▁▁▃▁▁▃▁▁▁▁▃▄▆▇██████▆▅▄▃▁▄▄▄▄▁▄▃▆███▆▁▁▁▃▁▁▁▁▅████▆▅▅▆ █
  40.7 μs      Histogram: log(frequency) by time      49.9 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

Would be nice if we had a way to tell the compiler “no really, I do want this particular code to be branchless!” without using a hammer that impacts all code like that ENV variable.

8 Likes